Is there a castle behind the ML moat?
There was an interesting article about how neither Google nor OpenAI has any moat. It is a good article and you should read it. Note: that article is the opinion of one of Google's employees and does not represent the opinion of the company. I find it funny that it has become common to add these disclaimers when companies don't represent the opinions of their employees.
The moat (and the castle behind it) metaphor was popularized in recent times by Warren Buffett. You need a moat to protect a castle. I am not sure any of the ML models are themselves castles. They are part of a castle, where the castle is a popular product. It is much easier to incorporate ML into one of your existing products to improve it than to build a product purely around ML models. At this point most of the use cases for pure ML models are in entertainment. That's not to say entertainment is not lucrative.
And, as the author rightly points out, there is a lot of competition in building these models. Should a company enter a market where there is a lot of competition?
In my opinion Google should not compete with the companies and teams making ML models. Instead, it should provide them with services that help them design, train, test, and run these models.
Run
All of these ML models require a lot of compute. Google can and should focus on providing an affordable way to run these models on GCP + TPU. Google is already investing heavily in TPUs and competing with Nvidia. In my opinion Google should invest even more in providing an affordable way to run these models, instead of competing with specific models.
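To make this concrete, here is a minimal sketch of what "running a model on GCP + TPU" looks like with JAX. The tiny predict function, the shapes, and the install command are invented for illustration; the point is only that XLA compiles the same code for whatever accelerator backs the VM:

```python
# Minimal sketch, assuming a GCP Cloud TPU VM with a TPU-enabled JAX
# build installed (e.g. pip install "jax[tpu]"). The model below is a
# made-up single layer, purely for illustration.
import jax
import jax.numpy as jnp

print(jax.devices())  # on a TPU VM this lists TpuDevice entries

@jax.jit  # XLA compiles this for whatever backend is available (TPU here)
def predict(params, x):
    w, b = params
    return jnp.tanh(x @ w + b)

key = jax.random.PRNGKey(0)
w = jax.random.normal(key, (128, 128))
b = jnp.zeros(128)
x = jax.random.normal(key, (8, 128))
print(predict((w, b), x).shape)  # (8, 128), computed on the TPU
```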
At this point Nvidia's GPUs are, by far, the best GPUs for running these models. Google could help Nvidia's biggest competitor (AMD) by contributing to ROCm. Doing so would increase competition in the GPU market.
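One concrete reason ROCm investment pays off: PyTorch's ROCm builds reuse the torch.cuda namespace, so code written against CUDA can often run unmodified on AMD GPUs once the stack underneath is solid. A small sketch, assuming a ROCm build of PyTorch on a machine with an AMD GPU:

```python
# Illustrative only; requires a ROCm build of PyTorch and an AMD GPU.
import torch

print(torch.cuda.is_available())  # True on a working ROCm build too
print(torch.version.hip)          # a version string on ROCm, None on CUDA builds

x = torch.randn(1024, 1024, device="cuda")  # "cuda" maps to the AMD GPU under ROCm
y = x @ x
print(y.device)
```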
OpenAI's Triton supports Nvidia's GPUs but doesn't yet support AMD's. From Triton's documentation:
“The resulting IR code is then simplified, optimized and automatically parallelized by our compiler backend, before being converted into high-quality LLVM-IR—and eventually PTX—for execution on recent NVIDIA GPUs. CPUs and AMD GPUs are not supported at the moment, but we welcome community contributions aimed at addressing this limitation.”
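For context, this is roughly what a Triton kernel looks like (adapted from Triton's introductory vector-add tutorial). It is this Python-level code that gets lowered through Triton's IR to LLVM-IR and then PTX, which is why it currently runs only on Nvidia GPUs:

```python
# Standard Triton vector-add example; assumes triton is installed and a
# CUDA-capable GPU is present.
import torch
import triton
import triton.language as tl

@triton.jit
def add_kernel(x_ptr, y_ptr, out_ptr, n_elements, BLOCK_SIZE: tl.constexpr):
    pid = tl.program_id(axis=0)
    offsets = pid * BLOCK_SIZE + tl.arange(0, BLOCK_SIZE)
    mask = offsets < n_elements          # guard the tail of the array
    x = tl.load(x_ptr + offsets, mask=mask)
    y = tl.load(y_ptr + offsets, mask=mask)
    tl.store(out_ptr + offsets, x + y, mask=mask)

x = torch.randn(4096, device="cuda")
y = torch.randn(4096, device="cuda")
out = torch.empty_like(x)
grid = (triton.cdiv(x.numel(), 1024),)   # one program instance per block
add_kernel[grid](x, y, out, x.numel(), BLOCK_SIZE=1024)
```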
Design
Google tried to do this by open sourcing TensorFlow, and they continue to make significant improvements to it. But there is a lot of competition in this market too (PyTorch, Triton, etc.). Google should make it easier to run all of these frameworks on its cloud rather than favoring its homegrown framework (TensorFlow).
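The competition is real partly because the frameworks are nearly interchangeable for common work. A small illustrative sketch (assuming both libraries are installed) of the same computation in TensorFlow and PyTorch; a cloud that runs every framework well is more useful than one that privileges a single homegrown one:

```python
import tensorflow as tf
import torch

# TensorFlow
a_tf = tf.random.normal((512, 512))
b_tf = tf.linalg.matmul(a_tf, a_tf)

# PyTorch
a_pt = torch.randn(512, 512)
b_pt = a_pt @ a_pt

print(b_tf.shape, b_pt.shape)  # same result, different framework
```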