I have a good idea for this post, but it has more to do with the local aspect of AI with regard to security. In the past I was hesitant to use AI to help maintenance personnel learn and understand engineering manuals, because the manuals and systems are the proprietary property of the companies that make the equipment. Employees can use the manuals, but they aren't for general consumption; they aren't even allowed to make copies of them. If I could build a local AI out of the manuals, what a great troubleshooting tool it would be, especially for new employees. They could chat with a local AI and learn how the systems they need to inspect and repair actually work.
If I made a local AI, could I be sure the proprietary information doesn't end up stored somewhere on DeepSeek's servers? I was always apprehensive about uploading the manuals to ChatGPT because of security concerns, but if I could build a local AI, would there be any security concerns at all?
Yes, you can be sure of that. A local model has only two sources of information. The first is the so-called "weights" file. Changing it is the only thing properly called "training", and it is enormously resource-hungry; realistically it can be done only on a server farm built for that purpose. The second is the "context" data, which lives in RAM. Your documents go there, and they are wiped out completely as soon as you close the AI process, in this case by closing PowerShell.

The only scenario in which recoverable traces could remain, and then only for extremely skilled attackers (the kind that typically work for governments), is if parts of the context are left behind in the operating system's swap file or hibernation file. That can be a concern on a Windows server, which typically uses a large page file and often leaves a hibernation file behind. Linux servers, by contrast, often set swap to zero or to a very small size; you typically just put as much RAM in a Linux server as you actually need and don't rely on swap at all. So on a Linux server you can effectively eliminate the risk; on a Windows server the risk is tiny but not zero.
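To make the "context lives in RAM" point concrete, here is a minimal sketch of the kind of local setup described above. It assumes the llama-cpp-python bindings and a GGUF model file already downloaded to disk; the file paths, manual name, and question are placeholders, not anything from the original thread. The manual text is read into process memory, handed to the model as context, and is gone when the process exits; nothing is sent over the network.

```python
# Minimal local-only sketch, assuming the llama-cpp-python package and a
# locally downloaded GGUF model. Paths and names below are placeholders.
from llama_cpp import Llama

# Load the weights file from local disk; it is only read at inference time,
# never modified, and never uploaded anywhere.
llm = Llama(model_path="./models/local-model.gguf", n_ctx=8192, verbose=False)

# Read the proprietary manual into RAM; it stays inside this process.
with open("pump_manual_excerpt.txt", "r", encoding="utf-8") as f:
    manual_text = f.read()

# The manual goes into the prompt, i.e. the model's context window in RAM.
response = llm.create_chat_completion(
    messages=[
        {
            "role": "system",
            "content": "Answer maintenance questions using only the manual "
                       "text below.\n\n" + manual_text,
        },
        {
            "role": "user",
            "content": "What is the inspection interval for the main bearing?",
        },
    ],
    max_tokens=512,
)

print(response["choices"][0]["message"]["content"])
# When this process ends, the manual text and the context are released from RAM.
```

Whether you launch something like this from PowerShell or a Linux shell, closing the process is what clears the context; the only persistence concern is the swap/hibernation caveat described above.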
Thank you
Your response is very helpful. Thanks.
Thank you!