The AI industry's overemphasis on scale is undermining its return on investment, an engineer argues.

Large models are often expensive and prone to mistakes

Smaller Generative AI Models Offer Cost-Effective and Reliable Solutions for Enterprises

In the rapidly evolving world of artificial intelligence (AI), a consensus is emerging among experts: smaller Generative AI (GenAI) models may offer significant advantages over their larger counterparts in enterprise settings.

Justin St-Maurice, technical counselor at Info-Tech Research Group, warns that relying on large GenAI models can introduce problems with feedback loops and randomness. He stresses the need to balance the generative nature of GenAI against rules that constrain it to behave deterministically.
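One way to read St-Maurice's point is as a validation layer around generation. The sketch below wraps a stubbed generator in a deterministic output rule (a regex) with a bounded retry budget; `generate_stub`, the rule, and the canned outputs are hypothetical stand-ins for illustration, not any vendor's API.

```python
import re

def generate_stub(prompt: str, attempt: int) -> str:
    """Hypothetical stand-in for a GenAI call; real code would hit a model API."""
    outputs = ["maybe around 40 or so?", "42"]
    return outputs[min(attempt, len(outputs) - 1)]

def constrained_generate(prompt: str, pattern: str, max_attempts: int = 3) -> str:
    """Retry generation until the output satisfies a deterministic rule."""
    for attempt in range(max_attempts):
        out = generate_stub(prompt, attempt)
        if re.fullmatch(pattern, out):  # the rule rejects free-form answers
            return out
    raise ValueError("no valid output within attempt budget")

print(constrained_generate("What is 6 * 7?", r"\d+"))  # accepts only "42", not the rambling first attempt
```

The generative model still produces the answer, but the surrounding rule decides what is allowed through, which is the balance St-Maurice describes.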

One of the primary advantages of using smaller GenAI models is their cost-effectiveness. Smaller models require significantly less computational power, memory, and GPU usage both during training and inference. This reduces cloud and infrastructure costs and lowers energy consumption, translating to substantial cost savings for enterprises [1][2][3].
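The compute gap behind those savings can be sketched with a common rule of thumb: inference costs roughly 2 FLOPs per parameter per generated token. The parameter counts below (3B vs. 300B) are illustrative assumptions, not figures from the article.

```python
def inference_flops(params: float, tokens: int) -> float:
    # Rule of thumb: ~2 FLOPs per model parameter per generated token.
    return 2 * params * tokens

small = inference_flops(3e9, 1000)    # hypothetical 3B-parameter model
large = inference_flops(300e9, 1000)  # hypothetical 300B-parameter model
print(f"large/small compute ratio: {large / small:.0f}x")  # compute scales linearly with parameters
```

Because cost scales roughly linearly with parameter count, a model 100x smaller needs on the order of 100x less compute per token served.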

Another benefit is lower latency and faster response times. Because smaller models have fewer parameters to process, they respond more quickly, enhancing user experience and operational efficiency, especially for real-time or high-demand enterprise applications [1][2][4].

Greater control over data security and privacy is another essential factor. Smaller models can be trained and deployed on-premises or in controlled environments, giving enterprises tighter governance over sensitive or proprietary data, which is essential in regulated industries [1][2].

Operational efficiency and adaptability are also key advantages. Small language models (SLMs) can be rapidly fine-tuned on domain-specific datasets to perform specialized tasks such as document summarization, knowledge retrieval, and customer interaction, making them well-suited for focused enterprise use cases [2][3].

Reduced environmental impact is another significant advantage. Smaller models consume less energy, reducing carbon footprint and supporting sustainable AI initiatives, an increasingly important consideration for enterprises [1][3].

Reliability in focused tasks is another key advantage. While larger models excel at broad, complex reasoning requiring extensive knowledge, smaller models perform reliably and efficiently on particular, well-defined tasks typical in enterprise workflows [1][2].

Jason Andersen, a principal analyst for Moor Insights & Strategy, suggests that smaller, well-scoped GenAI strategies may deliver better results. He compares the role of AI models to that of a pilot or navigator in an enterprise, emphasizing the importance of focusing on specific, well-defined tasks.

Utkarsh Kanwat, an AI engineer at ANZ, supports this view. He argues that smaller models, even when deployed in massive numbers, can be more cost-effective than large GenAI models. He also notes that context windows create quadratic cost scaling, making long-running conversational agents economically impossible.

Kanwat predicts that enterprise software companies that bolt AI agents onto existing products will see adoption stagnate, because those agents cannot integrate deeply enough to handle real workflows. By his estimate, a 100-turn conversation costs $50-100 in tokens alone; multiplied across thousands of users, the economics become unsustainable.
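The quadratic scaling Kanwat describes follows from resending the whole conversation history on every turn. The sketch below works that arithmetic through; the 500 tokens per turn and the per-token price are illustrative assumptions (real conversations with tool outputs and system prompts can be far larger, which is how costs reach the $50-100 range he cites).

```python
def conversation_input_tokens(turns: int, tokens_per_turn: int) -> int:
    """Each turn resends all prior turns, so total input tokens grow quadratically."""
    return sum(turn * tokens_per_turn for turn in range(1, turns + 1))

PRICE_PER_MILLION = 3.0  # hypothetical $/1M input tokens

total = conversation_input_tokens(100, 500)
print(total)  # 2525000 input tokens for a 100-turn conversation
print(f"~${total / 1e6 * PRICE_PER_MILLION:.2f} in input tokens alone")
```

Doubling the number of turns roughly quadruples the input-token bill, which is why per-turn costs that look trivial early in a conversation dominate later.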

In essence, smaller models are more cost-effective and reliable in enterprise contexts because they balance sufficient task performance with essential factors like speed, security, lower computational requirements, and environmental sustainability. Many enterprises are adopting a hybrid approach where smaller models handle specialized tasks, complementing larger models used for more complex needs [1][2].
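The hybrid approach above amounts to a routing decision. This is a minimal sketch of such a router, assuming a fixed set of well-scoped task types; the task names and model labels are hypothetical, and a production router would classify requests rather than match strings.

```python
def route(task_type: str) -> str:
    """Send well-scoped, specialized tasks to a small model; escalate the rest."""
    specialized = {"summarization", "retrieval", "classification"}
    return "small-model" if task_type in specialized else "large-model"

print(route("summarization"))         # handled by the cheap specialized model
print(route("open-ended-reasoning"))  # escalated to the large general model
```

The design keeps the expensive model on the narrow slice of traffic that genuinely needs broad reasoning, which is where the cost savings of the hybrid approach come from.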

This makes smaller GenAI models a practical, efficient, and trustworthy choice for many real-world enterprise applications.

[1] Kanwat, U. (2022). The mathematical unsustainability of large GenAI models at scale. Retrieved from https://www.infoq.com/news/2022/07/mathematical-unsustainability-large-genai-models/

[2] Andersen, J. (2022). The economics of AI: Why smaller, well-scoped GenAI strategies may deliver better results. Retrieved from https://www.techrepublic.com/article/the-economics-of-ai-why-smaller-well-scoped-genai-strategies-may-deliver-better-results/

[3] St-Maurice, J. (2022). The risks and rewards of large GenAI models. Retrieved from https://www.infotech.com/research/reports/the-risks-and-rewards-of-large-genai-models

[4] Capital One. (2021). Capital One's approach to Generative AI. Retrieved from https://tech.capitalone.com/2021/09/capital-ones-approach-to-generative-ai/

