Monday, April 14, 2025

2025-04-14 - VPN Server Went Down Again!

        This time I was seriously starting to feel like if this darn thing can't stay up and running for even a month without some problem that takes a ton of effort to troubleshoot, then it isn't worth using. I had also concluded that if I just reinstalled PiVPN with WireGuard (WG), and then set up WG again on every device that uses the VPN, that would likely solve the problem. But that was only after exhaustive troubleshooting that went nowhere, because nothing appeared to be wrong. There were no catastrophic failures, no brown-outs, and no driver, kernel, or OS updates that would have thrown it out of whack. After multiple days of putting my Proxmox Active Directory VMs aside, because everything I have centers around my VPN functioning properly, I had concluded that reinstalling the thing would almost certainly fix the issue.

        I even had a friend come over. The first time he helped me with my server, about three weeks ago, the problem cleared up the second he sat down, because everything that was wrong with my server just vanished. Before that, the server would turn on, but the keyboard lights didn't flicker during POST. The motherboard looked as lively as ever, but there was no screen output. Part of that is because it's a server that runs headless, so when you plug a monitor in, nothing shows up unless you reboot. And nothing worked at that point because I had tried to reboot it without knowing it was in the middle of a kernel update, which apparently temporarily bricked the server for the whole next day until, I assume, it sorted itself out. After that it could be rebooted, it produced output for a monitor, we could enter the BIOS, and there were no problems. The server booted fine and everything worked perfectly, as if nothing had happened. Weeks later I picked my TEMPer2 project back up, which uses a USB temperature sensor to read the temperature of my room and feed it to Home Assistant, which then trips my AC on or off. For that project I ran a command to check whether the latest version of the kernel was installed, and it said yes. I only discovered that the answer was apparently no when I hit this VPN problem: I rebooted to see if that would solve it, and the boot took an inordinate amount of time. I couldn't SSH back into the machine for maybe twenty minutes, I think. It finally worked on my fourth attempt. That was a few days ago, during the latest round of server troubleshooting for the VPN.

        So once again we go through this problem. I couldn't use my VPN while I was out, working on my Active Directory project to make myself more hirable. I was at BYU trying to connect to the internet. Sometimes the visitor WiFi won't load the captive portal, where I agree to the terms and conditions, unless I turn off my VPN in WG. So I turned it off, connected, and turned it back on, and it still wasn't working. But I just connected to it. What do you mean there's no internet? I checked the WG client and saw that neither tunnel was receiving any packets back from my VPN server: X amount of KB sent, 0 received. Then I tried turning off both WG tunnels and using AnyDesk unattended access to get into my home desktop, and I couldn't do that either. What's going on here?! When I got home and did some troubleshooting, I discovered that I had accidentally deleted the authenticator entry (token, key, account, whatever you want to call it) for unattended access to my desktop while troubleshooting another problem for my mom a few months ago. I had been using the same free authenticator app for years because I didn't know there were better free options. There was little documentation online on how to troubleshoot it, and although you could create several entries for different devices, you couldn't rename them very easily. I found this extremely frustrating. So I switched to a very popular free one, Microsoft Authenticator, which does let you easily rename "accounts", as long as you know that's what they call them.

        Okay, so then I got that back up and running. But then I spent hours and hours troubleshooting the VPN itself. I found I could easily ping and SSH into the server, and open and use server files over SMB from my desktop, or even from my laptop when it was on the internet in my room with my server. By all appearances, nothing was wrong with the server. If I didn't have a VPN that I rely on heavily, I would never have known that my server had a problem.

        But I ran out of ability to troubleshoot. Everything was working fine. There were no catastrophic events, and all the WG keys and IP addresses were entered correctly. I am not the best at reading logs, but I checked them, and although I couldn't understand most of them, I did my best and still didn't find anything wrong. Every time I googled, I couldn't find anything that matched my particular problem. Google's AI Overviews, the thing at the top of most Google searches, had more to say than any result I found. Even when it sounded like it might be in the same neighborhood as my VPN problem, it would just link me to a site with a guy who was like 'yeah I don't know what I'm doing, followed instructions installing WG best I could, donno what I did wrong'.

        Whenever I explained the problem to ChatGPT, it told me to enter "sudo wg show" and look at the connection status to see whether any bytes had been received. That command simply didn't show anything except the various clients with their own tunnels and the virtual /32 IP addresses WG assigned them. The keys were omitted, and there were no lines showing connection status for any of them. I told ChatGPT this wasn't showing status, and it insisted that it was supposed to be there, and finally told me to check the status another way using "sudo systemctl status wg-quick@wg0". I didn't get far with Grok 3 either.
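
        Looking back, the missing status lines may have been a clue in themselves. As far as I can tell, "wg show" only prints the handshake and transfer lines for a peer after a handshake from that peer has actually reached the server. Here is a minimal Python sketch of checking that in a less ambiguous way, assuming the tab-separated layout that "wg show <interface> dump" is documented to print (one interface line, then one line per peer). The sample text, key names, and addresses are made up for illustration and are not from my server.

```python
from dataclasses import dataclass


@dataclass
class PeerStatus:
    public_key: str
    endpoint: str
    latest_handshake: int  # Unix time; 0 means no handshake has completed
    rx_bytes: int
    tx_bytes: int

    @property
    def ever_connected(self) -> bool:
        return self.latest_handshake > 0


def parse_wg_dump(dump_text: str) -> list[PeerStatus]:
    """Parse the tab-separated text printed by `wg show <interface> dump`."""
    lines = [line for line in dump_text.splitlines() if line.strip()]
    peers = []
    # The first line describes the interface itself; peers start on line two.
    for line in lines[1:]:
        fields = line.split("\t")
        if len(fields) < 8:
            continue  # not a complete peer record, so skip it
        public_key, _psk, endpoint, _allowed, handshake, rx, tx, _keepalive = fields[:8]
        peers.append(PeerStatus(public_key, endpoint, int(handshake), int(rx), int(tx)))
    return peers


# Made-up sample standing in for real `sudo wg show wg0 dump` output.
SAMPLE_DUMP = (
    "PRIVKEY\tSERVERPUB\t51820\toff\n"
    "LAPTOPPUB\t(none)\t(none)\t10.6.0.2/32\t0\t0\t0\toff\n"
    "PHONEPUB\t(none)\t203.0.113.7:51820\t10.6.0.3/32\t1744650000\t1024\t2048\t25\n"
)

if __name__ == "__main__":
    for peer in parse_wg_dump(SAMPLE_DUMP):
        state = "has connected" if peer.ever_connected else "no handshake yet"
        print(f"{peer.public_key}: {state}, rx={peer.rx_bytes} tx={peer.tx_bytes}")
```

        If every peer shows a handshake time of 0 while the server is perfectly reachable on the local network, the clients' packets are probably never arriving, which points at something between the internet and the server rather than at WG itself.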

        I had restarted services several times, turned WG on my laptop on and off multiple times, rebooted the laptop, and rebooted the server. That last one was exciting, because it took a lot longer to reboot than it should have, which makes me think the kernel finally updated. I checked multiple logs. By today I was starting to double check everything, because I figured I just had to have missed something simple.

        My friend came over. He is actually in Information Systems, but he isn't familiar with a lot of Information Technology stuff, I guess because his job doesn't have much in the way of IT. He offered his help but said he didn't know how VPNs work, so I said that was fine, I do, I just can't figure this problem out. He came over early this morning and I started explaining WG, VPNs, and how packets move on and off a network, so he could connect it to what he already knew. He admitted that routing is confounding to him. I explained TCP and UDP, port forwarding, and split (half) tunnels and full tunnels with each of their pros and cons. I had drawn a diagram of my physical network, laid out logically with the VPN taken into account. Then I drew another diagram of a cloud with a tunnel running through it, with my server on one end and my laptop on the other, and explained the purpose of the keys, the encryption, and how the routers figure into the tunnel.

        I showed him the wg0.conf file, which has all the keys, tunnel names, IP ranges, IP addresses, etc. I showed him the tunnel configuration window in the WG client on my laptop, accidentally turned the tunnel on, and then had trouble SSHing into my server. I couldn't figure out what the problem was now, and then I realized, oh, I accidentally turned on the tunnel, so I have no internet. He didn't understand why I would have no internet, so I explained that since it's a full tunnel, all network traffic to and from my laptop now has to go over the VPN, except possibly DHCP. Maybe that's why my WiFi icon changes to a no-network icon when I connect to the internet and turn on the VPN. My first guess was that even DHCP can't reach the laptop, though it seems more likely that the laptop's connectivity check simply can't get out through the dead tunnel, so it treats the connection as dubious.

        Then he wondered why it would matter. If I turned on my VPN while at home, why would the connection have to go outside my network and back in? I explained that the WG client is configured with the public IP address my ISP gave my apartment's router, so when it looks for my network it looks for that public IP address. The traffic goes through NAT and the firewall, and the firewall will only let it in on the port WG is using, which is usually 51820. That's why you'll see an IPv4 address with a ":51820" at the end of it. Inbound ports are all closed by default and only opened according to what you approve. If you browsed the web over unsecured HTTP fifteen years ago or whatever, that was port 80. Now it's almost entirely port 443, for HTTPS over TLS (what used to be SSL). WG gets UDP port 51820 by default, and anything arriving on 51820 will go straight to my server as long as port forwarding is configured on every router of the network local to the server, on the customer side of the demarcation point.

        As I understand it, Xfinity, which is an ISP a lot of people around me have, doesn't buy a huge pool of public IPs for customers to use. Instead they buy just a few and run their own layer of NAT, so many customers share one public address. That would be why, as far as I know, you can't host a VPN like this with Xfinity: you would have to get the one port forward their shared address allows, because I assume a given port can only be assigned to one destination. I would look that up if I didn't already have a packed schedule.
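
        To make the full tunnel versus split tunnel idea from that conversation concrete, here is a small Python sketch. It assumes the usual WG client behavior where the AllowedIPs list in the tunnel config decides which destinations are sent into the tunnel. The address ranges are hypothetical and not copied from my wg0.conf, and a real operating system consults its whole routing table, so treat this as a simplification.

```python
import ipaddress


def goes_through_tunnel(destination: str, allowed_ips: list[str]) -> bool:
    """Return True if traffic for `destination` matches any AllowedIPs entry."""
    address = ipaddress.ip_address(destination)
    for entry in allowed_ips:
        network = ipaddress.ip_network(entry, strict=False)
        # Only compare IPv4 with IPv4 and IPv6 with IPv6.
        if address.version == network.version and address in network:
            return True
    return False


# Full tunnel: every destination matches, so everything goes over the VPN.
FULL_TUNNEL = ["0.0.0.0/0", "::/0"]
# Split tunnel: only the (hypothetical) VPN subnet and home LAN go over the VPN.
SPLIT_TUNNEL = ["10.6.0.0/24", "192.168.1.0/24"]

if __name__ == "__main__":
    for destination in ["192.168.1.50", "8.8.8.8"]:
        print(
            destination,
            "full:", goes_through_tunnel(destination, FULL_TUNNEL),
            "split:", goes_through_tunnel(destination, SPLIT_TUNNEL),
        )
```

        With the full tunnel list, even a public address like 8.8.8.8 is sent into the tunnel, which is why a dead tunnel looks like no internet at all. With the split list, only the home ranges would break.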

        After Robert left and we had run out of time, I had the idea to just ask Grok what the most common problems were that fit my situation. It said the most likely causes were firewall or port-forwarding issues, NAT or routing problems, or an MTU (maximum transmission unit) mismatch. An MTU mismatch is when layer 3 devices along the path are configured with different maximum packet sizes. Packets that were sized for one link then run into a device whose MTU is set smaller, and that device has to fragment them or can't pass them at all, which causes errors. You really want every device on your network set to the same MTU, and if you're sending and receiving over the internet then you want your network to conform to what the internet uses. That is almost always 1500 bytes, the Ethernet standard, or 1492 on connections that lose 8 bytes to PPPoE overhead. Consumer routers are automatically configured for this, so a layperson doesn't have to worry about it.
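
        The MTU arithmetic for the tunnel itself can be sketched in a few lines of Python. This assumes the per-packet overhead described in the WireGuard documentation: an outer IPv4 or IPv6 header, a UDP header, and 32 bytes of WireGuard framing. The function name is my own, and MTU did not turn out to be the cause of my problem, so this is only to show where the numbers come from.

```python
IPV4_HEADER = 20  # bytes
IPV6_HEADER = 40  # bytes
UDP_HEADER = 8    # bytes
WIREGUARD_OVERHEAD = 32  # 4 type + 4 receiver index + 8 counter + 16 auth tag


def tunnel_mtu(link_mtu: int, outer_ipv6: bool = True) -> int:
    """Largest inner packet that fits in one outer packet on the given link."""
    outer_ip = IPV6_HEADER if outer_ipv6 else IPV4_HEADER
    return link_mtu - outer_ip - UDP_HEADER - WIREGUARD_OVERHEAD


if __name__ == "__main__":
    print(tunnel_mtu(1500))                    # plain Ethernet, worst case IPv6 outer header
    print(tunnel_mtu(1500, outer_ipv6=False))  # plain Ethernet, IPv4 outer header
    print(tunnel_mtu(1492))                    # PPPoE link
```

        Budgeting for the larger IPv6 header on a 1500 byte link gives 1420, which matches the default MTU that WG tools normally pick for the tunnel interface.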

        Anyway, I spent about an hour trying to get any one of the last three roommates who have had control over the router to help me access it, so I could make sure the configuration was still good, because that was probably the one thing I hadn't checked. With the VPN being such a virtual thing, it never occurred to me to check anything physical except my own plugs in my room. I had internet, so there was no obvious clue that I was aware of that it was the router. But I wanted to check. This problem was confounding enough, and if I reinstalled PiVPN, I would lose the opportunity to learn what the heck went wrong, so that I could just check that the next time something like this happens.

        My current roommate with control over the internet and utilities said he was in class and was too busy to deal with this right now. He isn't an IT person, so he doesn't know that it really needs to be him that does it, because he would have all the credentials. I asked him if he could ask Nate, who had utilities and internet last, to give him the credentials, and then Robert before him. Then I rolled my eyes and just texted them myself. Both of them told me they didn't have it. Robert said Nate took over, and Nate said he never got it from Robert. Then Nate looked up his account online to see if it was still in his name, which it was, so my new roommate was no longer needed.

        I had never been allowed to access the router myself, because Robert was very cautious about messing with things that belonged to him and that he was responsible for without knowing what he was doing. So as long as it worked, the only ones who would touch it were the technician from the ISP and him. He would let me give him instructions and he would look at things for me, but now it's not even under his control. And Nate was sure he didn't have it either, but the account said otherwise. So now he is mad at Brenner, and I had to put out a fire when I already had a smoldering mess of my own. After we got all the drama out of the way, Nate asked for money, because the bill he apparently still has to worry about is now due, so Brenner and I have to pay thirty-five each for the fiber connection. And then he asked, more sympathetically, "Did you unplug the router for forty-five seconds?" I thought, no, but I guess it can't hurt to try.

        So I did, and I can't believe it. Everything worked after that. He called back later and I told him about it, and I said it was a good thing this happened: he isn't exactly rolling in dough and hadn't known he owed our ISP money, my internet wasn't shut off, the fix for my VPN was simple even if extremely frustrating and confounding, and we got to catch up, which we hadn't done in a while. We actually didn't particularly love living together, even though we had almost nothing to do with each other, with him living in the basement and me in the attic. His way in and out of the house is faster through the front door, while the stairs up to my room are right in front of the back door to the parking lot, and I have a car, which he doesn't. So we just never saw each other. I told him I thought this was a good thing, and I think he agreed with me. He has since asked me to help him run an errand, so I have to run.

This has been Truncat3d 00000000111100010100110______________end of line
