AI and Security
AI is like the Internet, Forever
In the past I was hesitant to use AI to help maintenance personnel learn and understand engineering manuals, because the manuals and the systems they describe are proprietary to the companies that make the equipment. The employees can use the manuals, but the manuals aren’t for general consumption. Employees aren’t even allowed to make copies of them. If I could build a local AI from the manuals, what a great troubleshooting tool it would be, especially for new employees. They could chat with a local AI and understand how the systems they need to inspect and repair actually work.
If I made a local AI, could I be sure that the proprietary information doesn’t end up stored somewhere at DeepSeek? I was always apprehensive about uploading the manuals to ChatGPT because of security concerns, but if I could build a local AI, would there be any security concerns?
The commenter is correct to be concerned about the public nature of AI. Just as uploading a picture to your “private” Facebook account actually puts it out into the wild forever, uploading anything to an Internet AI like DeepSeek or ChatGPT means that it is no longer secure or private in any way.
A local AI would only be reliably secure if it were on a single machine or a local area network with no connection to the Internet. As soon as anything is connected to the Internet, you need to assume that someone, somehow, is going to make the information on that machine, or on that network, accessible to outside parties. While it is theoretically possible to keep an Internet-connected machine secure, and there is an entire industry built around the concept, the combination of humans, machines, and the Internet is reliably prone to security failures.
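One rough way to sanity-check the "no connection to the Internet" condition on a Linux machine is to look for an IPv4 default route. The sketch below is only an illustration: the function name is made up for this example, and it assumes you pass it text in the layout of Linux's /proc/net/route. An empty result does not prove isolation (IPv6, proxies, and other hosts on the same LAN are not covered), but a non-empty one is a clear warning sign.

```python
from typing import List


def default_route_interfaces(route_table_text: str) -> List[str]:
    """Return the interfaces that carry an IPv4 default route.

    Assumes the column layout of Linux's /proc/net/route: column 2 is the
    destination and column 8 is the netmask, both in hex. A destination and
    mask of 00000000 together mean "default route".
    """
    interfaces = []
    rows = route_table_text.strip().splitlines()[1:]  # skip the header row
    for row in rows:
        fields = row.split()
        if len(fields) >= 8 and fields[1] == "00000000" and fields[7] == "00000000":
            interfaces.append(fields[0])
    return interfaces
```

On a real machine you might call it as default_route_interfaces(open("/proc/net/route").read()) and treat any returned interface name as a path off the local network that needs explaining.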



This use case has much more inherent security than your average Internet-connected server. The AI holds the documents exclusively in RAM (except for whatever might momentarily land in swap or hibernation files, and you can disable both in the operating system), and in a form that would be incredibly difficult to reconstruct. Only the absolute highest tier of hackers could manage it. That is, IF you have the self-discipline to load the files manually from a memory stick each time you boot the server, and then remove the stick. On Linux, reboots are not terribly frequent. But eventually you will start feeling lazy, put the files in root's home directory, "which is almost certainly totally secure," and have the boot process read in the data automatically on startup. At that point, you are operating at average security.
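The manual-load routine described above can be sketched in a few lines of Python. This is a minimal illustration, not a hardened tool: the function names are invented for the example, it assumes the manuals are plain-text files, and it assumes a Linux system where /proc/swaps lists active swap areas under a one-line header. It does not check hibernation settings, and it does nothing to stop the operating system from copying memory elsewhere.

```python
from pathlib import Path
from typing import Dict, List


def active_swap_areas(proc_swaps_text: str) -> List[str]:
    """Return the swap areas named in text laid out like /proc/swaps."""
    lines = proc_swaps_text.strip().splitlines()
    # The first line is the column header; each later line is one swap area.
    return [line.split()[0] for line in lines[1:] if line.strip()]


def load_manuals(swaps_file: Path, stick_dir: Path,
                 pattern: str = "*.txt") -> Dict[str, str]:
    """Read the manuals from the memory stick into a dict held in RAM.

    Refuses to load anything while swap is active, so that document text
    is less likely to be written to disk behind your back.
    """
    swap = active_swap_areas(swaps_file.read_text())
    if swap:
        raise RuntimeError(f"swap is active on {swap}; disable it first")
    return {
        path.name: path.read_text(encoding="utf-8", errors="replace")
        for path in sorted(stick_dir.glob(pattern))
    }
```

A call might look like load_manuals(Path("/proc/swaps"), Path("/mnt/stick")), where /mnt/stick is a hypothetical mount point for the memory stick. Once it returns, you would unmount and remove the stick. Nothing here writes the documents back to disk, which is the whole point of doing it by hand.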
As they say, "security is a process, not a state." Being secure means constantly not doing things the easy way, because of the side effects.