Tech/Basics/Horror Stories

From lathama

Learning Opportunities

There are many stories around tech that can make nerds cringe. While the stories may not be of disasters, they often show that large sums of money were wasted before anyone asked for help. Celebrate past problems as the learning opportunities they are!


Fortune 19 Company

A company that reached position 19 among the world's largest publicly traded companies had issues.

DC Coat Rack

A supposed backup data-centre went offline one winter day because the site had been using an emergency power shut-off switch as a coat rack. The shut-off switch is required by fire code; however, the industry largely uses a paddle design. This site used a large lever at a handy height to hold your coat.

Self Service Installer

While investigating an artifact issue with a Java repository I found some RPM builds that were overly large. Upon investigation I found that the developers were including alternate versions of Java in their builds and altering the pre/post package install system to install an alternate version of Java.

One ID to Rule Them All

A mass deployment and provisioning system sent up an alarm about conflicting MAC addresses in the database. This halted the business as we chased down the issue. It was found that a VMware admin had used the default site code at 35+ locations and the MACs were generated in a predictable manner based on the site code.
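A minimal sketch of the kind of check that flags this class of conflict, assuming the inventory can be exported as (hostname, MAC) pairs. The host names, MAC values and the helper names `normalize_mac` and `find_duplicate_macs` are hypothetical and not taken from the original system.

```python
from collections import defaultdict


def normalize_mac(mac: str) -> str:
    """Lower-case a MAC and strip separators so different formats compare equal."""
    digits = "".join(ch for ch in mac.lower() if ch in "0123456789abcdef")
    if len(digits) != 12:
        raise ValueError(f"not a 48-bit MAC address: {mac!r}")
    return ":".join(digits[i:i + 2] for i in range(0, 12, 2))


def find_duplicate_macs(records):
    """Return {mac: [hosts...]} for every MAC claimed by more than one host."""
    owners = defaultdict(list)
    for host, mac in records:
        owners[normalize_mac(mac)].append(host)
    return {mac: hosts for mac, hosts in owners.items() if len(hosts) > 1}


# Hypothetical inventory: two sites were built with the same default site code,
# so their generated MACs collide even though they are written differently.
inventory = [
    ("site01-vm01", "00:50:56:01:00:01"),
    ("site02-vm01", "00-50-56-01-00-01"),
    ("site03-vm01", "0050.5602.0001"),
]
duplicates = find_duplicate_macs(inventory)
print(duplicates)
```

Normalizing before comparing matters because the same address is often recorded with colons, dashes or dots depending on the tool that wrote it.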

License Renewal

A mass deployment workflow was impacted by an expired license. After a short amount of time management got frustrated with corporate accounting and just paid for a license on a credit card. The reason it expired was that, at the prior expiration date, the manager had also used their corporate credit card to renew the license. I never saw a resolution to this issue.


Academic Research Project

A leading-edge research project on the health of the Internet, starting in the 1990s.

Month, Day, Hour

A data set was divided up with time series labeled as MDH. The M turned out to be the lunar month, counted from an unexpected starting point. Special tools had to be built to help compare the data because of the method used for leap days and leap seconds in the original code.

Fire in the Data-centre

When you use a fiber landing zone for an ocean crossing landing site as a data-centre for gathering metrics, be aware that these sites are old. A water leak inside a wall caused an electrical panel to catch fire. This wall was an external wall that had become an internal wall over time. Buckets of water and open electrical panels, all to keep the international fiber running.

Climbing Wall Blocker

While working remotely in excess of 1000 miles from the office I had a strange blocker. Corporate human resources stated that they were going to disable my accounts until I came to the office to sign the liability release for a new climbing wall in the company headquarters. I shared the ultimatum with my awesome manager. After some face palms and other yelling it became clear that the HR team had not learned to digitize company policy and there were several HR documents that many of the international team had not known about. I was mailed a physical copy of the liability release to sign and return. I scanned it, signed it and emailed it to the HR head with my manager CC'd.


Printing Company

Rare language printing with global customers.

Private Network Addressing

The primary network for systems was on a non-private range, causing issues reaching a customer. Upon checking, it was a full 8-bit-prefix (/8) IPv4 network that started with a two-digit number. This network covered most of a country at that time, so work with people in that country was difficult or impossible. To make it more troublesome, the in-house mail server was configured in such a way that mail could not be sent to or accepted from that huge network.
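A short sketch of how an addressing plan can be checked against the RFC 1918 private ranges, using Python's standard `ipaddress` module. The example networks are hypothetical; `11.0.0.0/8` merely stands in for "a /8 starting with a two-digit number" and is not the network from this story.

```python
import ipaddress

# RFC 1918 private ranges, listed explicitly rather than relying on
# ipaddress' is_private, which also covers other special-purpose ranges.
RFC1918 = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("172.16.0.0/12"),
    ipaddress.ip_network("192.168.0.0/16"),
]


def is_rfc1918(network: str) -> bool:
    """True if the whole network sits inside one RFC 1918 range."""
    net = ipaddress.ip_network(network)
    return any(net.subnet_of(private) for private in RFC1918)


# Hypothetical candidates for an internal network.
for candidate in ("10.0.0.0/8", "11.0.0.0/8", "192.168.10.0/24"):
    print(candidate, is_rfc1918(candidate))
```

Any internal network for which this returns False shadows real Internet hosts, which is what made the customer unreachable.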


Manufacturing Company

Excessive Grounding

A small manufacturing company with multiple buildings had issues with bad networking gear. A lightning strike at one building was mentioned. New lightning rods were being installed. The damaged equipment was in another building. The buildings were connected by direct-bury fiber optics. Upon inspection I found that all buildings were grounding the two-post racks to the fiber optic shielding. One building had a direct ground for a rack. The amount of voltage on the rack ground was concerning. It took a mentor who wrote for the National Electrical Code (NEC) to visit and explain what had happened. Beware of manufacturing companies who think their engineers are experts on everything.


Managed Service Provider

Small operation with impossibly frugal clients.

Malware Epidemic

During an epidemic of malware there was a large number of systems with malware/viruses that needed cleaning. While pulling the drives and removing the problematic software was safest, the owner needed more speed. Not being a Microsoft user, I looked into the problem and found that the malware would be easy to remove if the watchdog process did not reinstall it. I wrote a script to disable and remove all the files quickly and instructed the techs to pull the power cord or switch off the power supply the moment they ran the script. This was harsh but sped the process up to lightning speed.

Everything is Stuck

A customer with a high-value server outage had issues with an unresponsive system. After a quick look I found a CD-ROM disc and inserted it. The Microsoft Windows AutoPlay woke the system up.

Rusty Situation

After a hurried session to document an ex-employee accessing systems and then locking them out for law enforcement I needed to document the physical systems. I found the primary server for the company located on an old water heater. The desktop computer case was so rusty that the bottom was no longer solid. Galvanic Corrosion should be taught in public schools.


Major Telecom

SS7 Audio Fun

During the setup of a new phone system for a wholesale business I was made aware of some audio issues. The issues were intermittent, which led me to look at the telecom circuits. After arriving with a box full of locally loved hamburgers and fries I was able to get details on which engineer was working on the new back-haul. After contacting the engineer and working through the issue we found that during the installation of a large back-haul every single fiber in the bundle was set up with the same gain/loss settings. The back-haul was configured in such a way that, by accident, properly configured fibers were in use first and as call volume increased the poorly configured fibers would be used. After addressing the issue with the engineer I was later told that this fixed issues for a LARGE number of businesses using this back-haul. The power of free food!


Executive Escalation

Banking Executive Escalates

A banking executive of a local bank reached out to the leadership of my clients for help with their home computer. I reluctantly visited the home and spoke with the children and the executive. It was a very awkward visit where the executive asked me to confirm my identity several times as they could not believe I was the expert. Once near the computer I could smell signs of burning and mentioned this. The executive and children thought this was normal. I opened the computer case and did not see any immediate signs. I unplugged the hard drive after a failed drive BIOS error. While I was unscrewing it from the case the executive felt it was their chance to tell me how smart they were and how stupid I was. I removed the hard drive and without looking handed it to the executive with a twisting motion to show that the PCB controller board was burnt, black and falling off the hard drive. I left and informed my clients that I would never accept escalations from them again. The bank did inquire about hiring me and I declined.


Global Orthopedic Implant Manufacturer

Anything that can get implanted to help your bones.

Production is Down

The CEO of a large enterprise called my mentor for support when a robotic manufacturing system had been down for around nine hours. My mentor dispatched me as I already worked with the enterprise and had full security access. I arrived to a large number of people standing around looking at the robotic manufacturing machine. I asked everyone to step away and/or return to their duties. I quickly looked at the system, removed the power, and inspected inside the control panel to find a small nut and washer at the bottom of the box. I quickly realized it was for a grounding stud. I re-attached the grounding wire for the DIN rails and computer. I powered the machine up and it resumed working. I left as I had other appointments that day. Later in the week I met with the manager of the machine maintenance department, a dear old friend. I was informed the production impact was about one million dollars an hour, give or take. When the CEO had asked for a root cause and heard that it was a disconnected ground wire things got loud.

Kaizen Expert

While working on environmental monitoring and security systems for a large enterprise I was asked by an employee about some solutions for their Kaizen project. At issue was the speedy movement and setup of a CNC metal working machine. The machines interacted over Ethernet and at times serial. I showed the employee some options in the wireless bridge and serial over Ethernet product categories. I had some devices in my vehicle so I offered to loan them for testing. A week later I arrived to see a mass of boxes for wireless bridges and a new project to enable machine movement. Later that day I was asked by leadership about my knowledge in the area. I informed them that they were not the only company doing Kaizen and that I was well versed. A meeting was set up to assist with the Kaizen project quickly. Another problem they had was plumbing water to the machines to fill the coolant tanks. Once filled, the tanks would only require minimal water to make up for evaporation. I presented the option of an atmospheric water generator. I later heard that all global operations had installed atmospheric water generators to speed machine movement. The wireless bridges and serial over Ethernet devices were validated and began rolling out. When asked about attaching the devices to the machines, the IT team was surprised to learn that we could drill holes in the CNC machines to solidly mount the devices directly. The CNC machines were already altered to have power for other needs, which enabled the devices to be powered directly. After these and a few other smaller changes we were set up to test. Another empty warehouse with the electrical system needed was chosen. A group of forklifts and trucks was able to move the machines, set up and begin production quickly.

Device Imaging

Working with the help-desk to develop playbooks for the newly installed environmental monitoring systems led to a good chat. The team was larger than I expected. The playbooks I was proposing were unique as they had few documented processes at all. I inquired and heard of a near one hundred percent manual process. I inquired about their provisioning process for mass deployments. After confirming I was not speaking Greek I did an impromptu class on DHCP/PXE and the fun things that can be done. One help-desk person asked about backups for desktops, of which they had none. In shock I offered some options and at the time they chose to try https://fogproject.org, an OSS solution. Shortly after I heard that the imaging backups saved an engineer with a bad hard drive.

Remote Access

While much of the IT industry enjoys management tooling and features like IPMI, this enterprise had no experience with them. Due in part to some strange procurement processes they had thousands of computers that required physical access to administer. I offered to loan them a Lantronix Spider I had in my bag for them to try. A week later I returned to check up and retrieve my Spider. There were about 50 Lantronix boxes sitting in the cubicle of my contact and everyone was very happy to use the new remote access. I asked an IT manager to look at the servers with me, and the manager promptly refused. Having full security access I went to the data-centre and looked at the servers. The Lantronix Spiders were in fact installed. I noticed some servers which are known to have an IPMI BMC installed. I informed the employee doing the installs that many of the models had management interfaces already. The manager was unhappy and began to make noise before the CEO pulled them aside for a chat.

Rack Requirements

A site electrical engineer had issues with a remote plant's power requirements for a computer rack. I looked at the request and reported that the number was very low. The engineer was shocked as he found it to be very high. I sat down and browsed some rack-mount servers with him and did a tally for a fully populated rack, which more closely matched my expectation. The engineer began to connect the dots: the amount of cooling requested in the HQ data-centre was very high. With this knowledge I suggested that existing data-centre spaces be reviewed in addition to the remote site that prompted the request.

Networking is Hard

Globalization is hard for some and more difficult for others. The global business, spread across 50+ countries, was in the middle of a project for site-to-site network connectivity. One site was using non-private IPv4 addressing for manufacturing systems. The network engineer was not skilled in this area. I was asked to consult on the project. A plan of pushing a flat /8 network globally and requiring all sites to change addressing was overly forceful in my opinion. I did a quick report of numbering systems and policies. I presented options like 802.1Q VLANs and super-networks. The current networking team was largely unaware of the technologies I suggested. At least one major site was found to have used 192.0.0.0/8 for their network, which also caused major issues. The project was paused until every site could develop a playbook for updating the IP addressing on devices that required manual changes. I learned that my CCIE study books should be shared with the team.
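A rough sketch of the super-network idea using Python's standard `ipaddress` module: one private supernet is carved into per-site blocks and the plan is checked for overlaps before anything is renumbered. The choice of `10.0.0.0/8`, the /16 per site, the site names and the `overlapping_sites` helper are all assumptions for illustration, not the plan that was adopted.

```python
import ipaddress
from itertools import combinations

# Assumption: the company standardises on 10.0.0.0/8 and gives each site a /16.
supernet = ipaddress.ip_network("10.0.0.0/8")
site_names = ["hq", "plant-a", "plant-b"]  # hypothetical sites
plan = dict(zip(site_names, supernet.subnets(new_prefix=16)))


def overlapping_sites(allocations):
    """Return pairs of site names whose networks overlap."""
    return [
        (a, b)
        for (a, net_a), (b, net_b) in combinations(allocations.items(), 2)
        if net_a.overlaps(net_b)
    ]


# A hypothetical lab that picked its own range inside plant-a's block.
conflicting_plan = dict(plan, lab=ipaddress.ip_network("10.1.128.0/17"))
print(overlapping_sites(plan))
print(overlapping_sites(conflicting_plan))

# 192.0.0.0/8 is 256 times larger than the private 192.168.0.0/16 inside it,
# so almost all of it is public address space.
legacy = ipaddress.ip_network("192.0.0.0/8")
private_192 = ipaddress.ip_network("192.168.0.0/16")
print(private_192.subnet_of(legacy), legacy.subnet_of(private_192))
```

Running a check like this per site before renumbering gives each site's playbook a concrete target range to move to.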

Hungry Electronics

While working on a clean-room reporting system I noted that the research refrigeration was not running correctly. The clean-room management had no idea that there was a problem. I spoke to the facilities team, who did not know of the issue. I presented a Kill A Watt tester that measures the energy consumption of a device by fitting between the wall outlet and the plug. I instructed an intern on the facilities staff on how to use the device. I suggested that the data points be added to a report along with the price of electricity and replacement refrigeration units. Around two weeks later I returned as I needed the tester for another issue. I was given a new-in-box tester as the company had ordered several. In the first week they had discovered the cost of operation was larger than expected and had already ordered replacement refrigeration units. Noting how hard it usually was to spend money on new capital expenses, I inquired with the manager. The manager shared that a new project was started to monitor power consumption at the site and later at all sites globally. Management was surprised by the cost of electricity vs modern energy-efficient replacements. Additionally it was found that the refrigeration units should have been on a circuit that was powered by a generator in power outage situations.
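The report boils down to simple arithmetic, sketched here with entirely hypothetical numbers. The daily readings, the electricity price and the replacement price are assumptions; the real figures from the site are not recorded in this story.

```python
def annual_cost(kwh_per_day: float, price_per_kwh: float) -> float:
    """Yearly electricity cost for a device from a measured daily reading."""
    return kwh_per_day * 365 * price_per_kwh


def payback_years(old_kwh_per_day, new_kwh_per_day, price_per_kwh, unit_price):
    """Years until a replacement pays for itself through energy savings."""
    saving = annual_cost(old_kwh_per_day, price_per_kwh) - annual_cost(
        new_kwh_per_day, price_per_kwh
    )
    if saving <= 0:
        return float("inf")  # the replacement never pays for itself
    return unit_price / saving


# Hypothetical readings: old unit 12 kWh/day, replacement 4 kWh/day,
# electricity at 0.15 per kWh, replacement unit price 1500.
old_cost = annual_cost(12.0, 0.15)
years = payback_years(12.0, 4.0, 0.15, 1500.0)
print(f"old unit: {old_cost:.2f} per year, payback in {years:.1f} years")
```

Putting the measured reading next to the replacement price in this way turns a maintenance complaint into a capital expense request that is easy to approve.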


Electronics Manufacturer

They make electronics that go in every car, tractor, and plane.

Network Problems

I was asked to help a brother of a client who owns and operates an automotive electronics manufacturing plant. It was a two hour drive to the site, and I brought some spare network gear with me. I arrived and looked at the dirty industrial plant. While the owner was distracted with another issue I asked an employee what had changed in the past month. They reported that a welder had broken and a new one was installed. I asked to see the welder. I looked at the welder and the surroundings for any clues. I asked if anything else had changed. The employee said that electricians installed a new power lead for the welder along the rafters. I looked up and saw a network cable zip-tied to the newly installed 50 amp power lead that spanned over 20 ft. The owner returned and I asked for a ladder to remove the zip ties and isolate the network cable. The owner proceeded to inform me of my ignorance and how that could never be the issue. I left and reported the findings to the brother. I guess it was fixed later after some sibling yelling. I heard that there was a pile of network switches with one or more bad ports.


Labor Union

Thousands of members supported by a tiny staff.

Phone Line Audit

A secretary at a labor union handed me a phone bill for a single building site. The bill was for many thousands of dollars. I did roughly two hours of checking and found an impossible number of unused lines. We confirmed the lines in use and then canceled the unused lines. This saved nearly $100k a year. I sat down with the head of the labor union to try to understand how this happened. It turned out that during campaigns to promote the labor union the phone lines were installed and used for phone calls to employees that were involved with the labor field. The head of the labor union assumed that no one had canceled the lines. I shared the total number of lines, which was several multiples of what they expected. In the following months the labor union audited all the bills from all service providers.

Server Down

Called for an emergency where the server was down. I arrive and see the server is clean and sitting in a safe space. I check and there was no reported hard-disk. I open the system and begin inspecting. Beyond a bit of dust things appear normal. I decide to re-seat all the cables. Upon touching the SCSI cable to the hard-disk it fell out. I get everything plugged in and the server starts up fine. When asked for a root cause I share my findings and can only assume it was thermal cycling on a cable never properly inserted.


Leading Clothing Maker

Domain Trust

In the 1990s one of the first eCommerce pushes for the company was not selling. I was asked to look into the issue. I attempted to order a shirt and found that the checkout process would send the user to a page that lacked the company logo and showed an IPv4 address instead of the company domain. I sat down with them and discussed Tech/Basics/Domain Trust, where the user stops trusting the process as the address in the address bar changes to an https://203.0.113.3/cgi-bin?checkout type URL. I later learned that the SSL certificate, expensive at the time, was assigned to the IP due to a network engineer's mistake, as DNS was outside their wheelhouse.
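A small sketch of an automated check for this problem, using only the Python standard library: it reports whether a checkout URL points at a raw IP address rather than a DNS name. The function name `host_is_ip_literal` and the `shop.example.com` host are hypothetical; the IP-based URL is the example from the story.

```python
import ipaddress
from urllib.parse import urlsplit


def host_is_ip_literal(url: str) -> bool:
    """True if the URL's host is a raw IP address rather than a DNS name."""
    host = urlsplit(url).hostname
    if host is None:
        return False
    try:
        ipaddress.ip_address(host)
    except ValueError:
        return False  # not parseable as an IP, so treat it as a DNS name
    return True


for url in (
    "https://203.0.113.3/cgi-bin?checkout",
    "https://shop.example.com/checkout",  # hypothetical branded host
):
    print(url, host_is_ip_literal(url))
```

A check like this could run against every link in the checkout flow so that a redirect to a bare address is caught before customers see it.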


Internet Service Provider / MSP / Webhosting (Third World)

DNS Upgrade

Upon arriving in a beautiful third world country to work on a project I found that the large ISP had two DNS servers with issues. The 486DX2 systems were having hard drive issues in 2009 for some reason. I quickly set up some modern systems and began building out a new network and software stack. I wrote a process and troubleshooting document to hand off support. At last check in 2025 the DNS servers have not had any upgrades.

VoIP Phone Menu Language

A customer using Cisco VoIP phones complained about the display language. Cisco does not provide language options for localization or l10n. Taking a risk I located the Cisco firmware encryption key and updated the firmware language packs to add a language localization for the country. We rolled out the updated firmware to nearly seventy thousand phones within a month.

Color Blind

While helping onsite with one of the newest data-centres in the country I noticed an installer having issues. The installer was well liked and doing good work. The issue was that the installer was color blind and having all the cable terminations double checked by another awesome installer. I inquired and the installer became very nervous like they were going to get fired. I spoke with the owner and we promoted the color blind installer to be a field manager.

Enabling Help-desk

With hundreds of corporate customers and thousands of networking devices like switches, routers, modems and more, the help-desk had a difficult time answering customer queries. Common practice was to have everyone log onto the devices to read, learn and investigate issues. I had recently set up an instance of MediaWiki to enable company documentation. I set up a tool that would check every networking device and save the configuration to a wiki page. The help-desk was able to quickly see the configuration. A side effect was the discovery of a large number of insecure settings and plain-text passwords. We began correcting all the issues. After a month the help-desk team learned to check Special:RecentChanges at the start of their shift to look for any recent changes. Customer satisfaction improved dramatically, as a fruit basket and other thanks were sent.
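The value of saving each configuration as a page revision is that changes become diffs. A minimal sketch of that comparison with Python's standard `difflib`, assuming two saved snapshots of one device are available as text. The device name, the config syntax and the `config_changes` helper are illustrative only and are not the original tool.

```python
import difflib


def config_changes(old: str, new: str, device: str) -> list[str]:
    """Unified diff between two saved configurations of one device."""
    return list(
        difflib.unified_diff(
            old.splitlines(),
            new.splitlines(),
            fromfile=f"{device} (previous)",
            tofile=f"{device} (current)",
            lineterm="",
        )
    )


# Hypothetical snapshots of the same device taken on two different days.
previous = "hostname edge-sw-01\ninterface 1\n description uplink\n"
current = "hostname edge-sw-01\ninterface 1\n description uplink to core\n"
changes = config_changes(previous, current, "edge-sw-01")
for line in changes:
    print(line)
```

An empty result means nothing changed since the last snapshot, which is the common case the help-desk can skip over at the start of a shift.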


Commercial Wholesaler

Legacy Support

A legacy business management system was approaching a license expiration. The software vendor had been shut down for years and the legacy consultant was not responding. When the consultant responded the quoted cost to renew the software was very high. I was asked to look at the issue. I found the software written in COBOL with no source or documentation about the license location. I was shown the license expiration date in the software interface. I inspected the server and copied all files related to the software onto a Linux laptop. I also copied any and all files that changed or were created around the time of the last license renewal. I used a COBOL decompiler to inspect the program. I found that the license expiration was hard-coded into the main binary. Using a hex editor I located the license expiration variable in the binary and altered the year field. Over a weekend working with the company IT staff I validated the change and functionality. We also inspected the workstation client software in the same fashion. I wrote a report of my findings along with a howto for the future.


State Government

Internet Access for All

In an effort to ensure that all schools in a state have internet access, a teacher, a classmate and I worked on a proposal for the state congress to review. We explained the various benefits and features the internet can offer a school. I worked with the librarians to get a letter of support. I worked with a few classmates to find resources on the internet at the time that could not be sourced via the libraries' inter-library loan system. It took over a week to find a website dedicated to documenting, explaining, and guiding the process of drilling the finger holes in bowling balls. I printed the webpage out and presented it to the librarians. They went to work doing research. Two weeks later we had multiple letters of support from various librarians at all levels of the state and federal library systems. The proposal for internet access to all schools was approved and funded by the state congress. I was told our work was instrumental in the process. Later I learned it also changed many librarians' approach to the internet.