NeuraLinux: Bringing GenAI to the Linux desktop

Programmers neglect the non-programming stuff

Aspiring programmers may perceive programming like playing the guitar to serenade a love at first sight: think of a fun song, write it, and play it. But in reality, it’s a concert. You have to collaborate with other musicians (version control), secure a venue and setting (DevOps), convince people to attend (marketing), tell attendees how to get there (documentation), and hold dress rehearsals (testing). Otherwise, you’re just a bum playing your trombone in the streets with no one listening, save for the disgruntled apartment residents above you who are the source of the tomatoes.

In this case, you may be a tech student spending a lot of time on a program with Swiss-cheese documentation and a good chance of regression upon the next update. When I’m invested in an idea, I’ll happily spend an inordinate amount of time perfecting every aspect before releasing; but if it’s for a class, I do the bare minimum so I can get back to the passion project. That is why, for the longest time, I didn’t respect non-passion projects: how could the creator have gotten any satisfaction out of the process and made something durable? Such projects seem like cheap ways to gild a LinkedIn profile. But someone recently changed my mind: the purpose could be to become a better job candidate and secure financial stability at a younger age, or to meet people who lead you to your dream job. Every tech student and I grind rather tirelessly to acquire summer internships for the same end goal. As someone who wants to travel across Asia to indulge in expensive cuisine, my ape brain empathizes with this point of view. Thus, I don’t blame programmers who neglect the non-programming stuff, whether they care deeply about their work or not, for two reasons.

First, they never rigorously learned how to do it. Nearing the end of my bachelor’s degree, I have had a single unguided project across all my classes, one where I had to stretch my creative muscles and account for every aspect of the project: the junior design requirement, specifically the CREATE-X program. We take any of our potentially profitable ideas and earn college credit for developing and presenting them. I love this idea and the creative freedom it fosters, but it’s not a commonly pursued option. Most people work on Vertically Integrated Projects (VIP), which the consensus says are hit or miss. The closest shared academic experience I had was in CS 2340, where our team had to recreate Crossy Road for Android while keeping it version-controlled. Except every single feature was already pre-decided and documented by the TAs, no one read through our BS commit messages (like the informative “Friday, yep” and “more stuff”), and the project didn’t even have to work. The class that is actually supposed to teach us how to effectively convey ideas and write proper documentation, LMC 3403, is typically taken in the last semester of college. And schools are not going to change their curriculum, because I’m sure what I’m requesting here is not feasible at the scale of a large university. So maybe give students extra fun projects to do? Well, students tend to half-ass whatever you give them so they can get back to their 17 impending assignments. We never get to make choices about the software stack, and we do what we’re told because it makes grading easier. The sole reason I have this experience is that the Doom Emacs GUI dragged me to the dark side of the force from its little WSL window at a young age. Its innocuous frontend looks very similar to the likes of Atom or Sublime, but a little digging, and you are compiling the Linux kernel so you can use LTO and save a few MB of memory. Reminds me of a certain comic… but I digress.
If you’re in a tech school, then communication skills are always viewed as auxiliary.

Second, it’s much less straightforward and immediately rewarding than programming. Let’s break it up into categories to be more specific.

  • Marketing: No boy wants to make the poster presentation or be the writer for the group, because we were raised to play with firetrucks and not talk about our feelings. And there are only 12 women in CS, and they have enough problems as is (probably due to the prior sentence). Marketing is a crucial skill we ape-brained brogrammers simply have not valued and nurtured. Even though good marketing is what separates mediocre from money-making software, we leave it by the wayside and “get to it when we get to it”.
  • Documentation: Keeping documentation up to date is like trying to get into your douchebag friend’s car while he keeps inching forward every time you’re about to get in. The thing you’re actually documenting is constantly changing, so it’s a never-ending game of catch-up. And there are so many places where features must be discussed, starting from your brain and ending with README.md. Depending on the scope of your project and the number of contributors, this kind of persistence must get exhausting. And I know you can automate API documentation with a great deal of tools, but that’s not what I’m talking about. I’m talking about the rationale: justifying design choices and explaining proper usage.
  • Testing: TDD is so mind-numbing because you must imagine every possible edge case that can go wrong. You think in terms of impossible inputs and buffer conditions that only materialize during Ragnarök. Starting with the unit tests feels like eating your veggies before the carbs. And it’s really hard to cover your whole codebase, so you have to make compromises and test the critical sections. However, a lack of tests makes regressions significantly harder to detect, and gives me something ChatGPT refers to as “regression anxiety”: the feeling that something has broken somewhere, but I have no means of figuring out what.
  • DevOps: Every day, 7 operational tools deprecate, but 14 are born. I completely made up that statistic, but my point is that the field is so saturated with new tools and trends that it feels like fashion. Today you will be wearing a shirt that says “Docker Swarm 4 Lyfe”, but tomorrow your friends will be touting “Kubernetes Rulez”. See, in frontend development, there is one straightforward task with harmless, funny side effects when things go wrong. Here, you must master all slices of the pie to create a trustworthy and scalable platform, and if you fail, there is hell to pay. It’s more like building a house from scratch: you must be a carpenter, plumber, electrician, bricklayer, and furnisher, and if you fail, then you are homeless…
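To make the testing bullet above concrete, here is a toy sketch of what a regression-catching edge-case test looks like. Everything here is invented for illustration: `parse_version` is a hypothetical helper, not from any real project.

```python
# Hypothetical example: a tiny function plus the edge-case test that pins down
# its behavior, so a future refactor that breaks it fails loudly instead of silently.

def parse_version(tag: str) -> tuple[int, int, int]:
    """Parse a 'v1.2.3'-style tag into (major, minor, patch)."""
    parts = tag.lstrip("v").split(".")
    if len(parts) != 3 or not all(p.isdigit() for p in parts):
        raise ValueError(f"malformed version tag: {tag!r}")
    major, minor, patch = (int(p) for p in parts)
    return (major, minor, patch)

def test_parse_version_edge_cases():
    # The happy path everyone remembers to check...
    assert parse_version("v1.2.3") == (1, 2, 3)
    # ...and the Ragnarök inputs that only a test will remember for you.
    for bad in ("v1.2", "1.2.3.4", "vX.Y.Z", ""):
        try:
            parse_version(bad)
            assert False, f"{bad!r} should have been rejected"
        except ValueError:
            pass  # expected

test_parse_version_edge_cases()
```

Five minutes of veggie-eating like this buys you a tripwire: when a regression lands, the test names the exact input that broke, which is the antidote to “regression anxiety”.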

So should we delete System32 and pursue more useful hobbies like carving wooden Mallard ducks because we will never write good software? Shall we don our tinfoil hats and wait for the next Y2K so we can return to a realm without computers? No, there is a silver lining. While academia seems like a bad place to inspire change, I believe LLMs come to the rescue. They are always happy to do our menial labor no matter the task, with incredible speed and accuracy. Aspiring programmers are already very comfortable with prompt engineering because, again, they always have 17 impending assignments. Why don’t we channel these skills into making sure no aspect of our projects stays neglected, reducing our workload and the surface area for failure? My goal this year is to just sit back and bounce ideas off of ChatGPT while it writes my programs, start to finish.

However, the more philosophical problem is that they pose an insurmountable and growing threat to our creative muscles. If GPT-4 can not only program better than us, but design and market better than us, when exactly are we supposed to improve those soft skills? What work will be left for us? This sounds like bad news for us lowly mortals, but it’s also the reason Nvidia’s stock value has soared by 223.75% this year alone. Needless to say, people with a lot of money are extremely bullish on deep learning uprooting every industry. Who knows whether those bets will pay off: whether chatbots will fundamentally change the way we live, or just service call centers and data analysts. But throwing money at the problem and training ever-bigger models is plateauing, and AI companies are hemorrhaging money in the meantime.

The problem with GPT-4 is that huge model size doesn’t just impact training, but inference too. Let’s run the numbers to see why.

  • The problem with GPT-4 reportedly having a 220-billion-parameter feed-forward network per “expert” is that each expert costs at least 220 billion multiply-accumulates per token generated, but that isn’t even the scary part.
  • GPT-4 reportedly has a 128k-token context window, 128 heads of attention, and a head dimension of 128. The total number of operations for self-attention (sequence length squared, times heads times head dimension, times two for the QKᵀ and AV products) comes out to 128,000² × 128² × 2 ≈ 537 trillion floating point operations per full-context query!!!
  • For my RTX 3060 Ti with 16.2 TFLOPS of FP32 throughput, assuming maximum parallelization, that works out to 33 seconds per query, per “expert” consulted. Not to mention its Godzilla memory footprint would literally take terabytes of memory without optimization and swapping.
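The back-of-the-envelope math above can be sketched in a few lines. The parameter counts and context size are the rumored GPT-4 figures from the text (not official numbers), and the 16.2 TFLOPS is the FP32 throughput of an RTX 3060 Ti:

```python
# Back-of-the-envelope cost of one full-context GPT-4 query.
# All model figures are the rumored/assumed specs from the text, not official numbers.

SEQ_LEN = 128_000    # context window, in tokens
N_HEADS = 128        # attention heads
HEAD_DIM = 128       # dimension per head
GPU_TFLOPS = 16.2    # RTX 3060 Ti FP32 throughput

# Self-attention: the QK^T and AV products each cost roughly
# seq_len^2 * (heads * head_dim) operations, hence the factor of 2.
attn_flops = SEQ_LEN**2 * (N_HEADS * HEAD_DIM) * 2

seconds = attn_flops / (GPU_TFLOPS * 1e12)

print(f"attention FLOPs per query: {attn_flops:.3e}")  # ~5.37e14, i.e. 537 trillion
print(f"time on one GPU: {seconds:.1f} s")             # ~33 s at perfect utilization
```

This ignores the feed-forward layers, KV-cache tricks, and lower-precision arithmetic, so treat it as an order-of-magnitude estimate; the point is that the quadratic `SEQ_LEN**2` term dominates everything else.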

So will we find a suitable sub-quadratic algorithm to replace self-attention by then? Personally, the same profound skepticism that urges me to wear Costco fashion and use Linux makes me believe we will not. Remember when the world was obsessed with 3D printing, VR, blockchain, NFTs, and LK-99? Well, all of those bets turned out flatter than expected… but I digress again. At least for now, we can spend more time designing instead of plumbing our systems, which is a win for us.