
Counterpoint Research Joins DTW24 Ignite as Official Media Partner

Counterpoint Research is pleased to announce its participation as a Media Partner in DTW24 Ignite.

When: 18th – 20th June, 2024

Where: Copenhagen


About the Event:

DTW24-Ignite will return to Copenhagen from June 18-20, offering a comprehensive exploration into future-proofing one’s AI-native journey from design to delivery. This event will provide invaluable insights into harnessing the transformative power of AI for innovation and leveraging the capabilities of composable IT. Attendees will have the opportunity to learn how to enhance intelligent networks and implement AI and automation at scale, while also exploring strategic initiatives and adjacent opportunities that unlock new revenue streams.

Read more about DTW24 Ignite


GPT-4o: OpenAI’s New Frontier in User Experience

OpenAI marked a significant leap forward with its much-anticipated spring update – not by launching a new model like GPT-5 but by introducing GPT-4o, a cutting-edge model that integrates audio, visual and text processing in real time. GPT-4o (“o” for omni) is all about enhancing user experience, and it comes packed with new features and improvements that are set to revolutionize human-machine interaction. Here are some key highlights from OpenAI’s announcement:

  • Real-time Multimodal Integration: GPT-4o combines audio, visual and text processing, enabling it to interact with users more naturally and intuitively. In a way, GPT-4o integrates three models – text, vision and audio.
  • Free Access with Improved Speed: OpenAI claims GPT-4o is 2x faster than GPT-4. Users can enjoy the intelligence of GPT-4 with even faster performance, all at no cost.
  • Enhanced Memory and Analytics: The addition of memory and advanced analytics allows for more sophisticated and personalized interactions. GPT-4o can interpret complex visuals like charts and memes alongside text inputs. Files can be uploaded directly from Google Drive and Microsoft OneDrive.
  • Multilingual Support: Available in 50 languages, GPT-4o caters to a global audience, breaking down language barriers.
  • Developer-Friendly APIs: Developers can leverage GPT-4o’s capabilities through newly available APIs, fostering innovation across applications.
  • User-centric Design: The new interface emphasizes a highly integrated and intuitive user experience.
  • Desktop App: OpenAI will also release a desktop application in addition to the mobile application to cater to a wider range of user needs.
  • Pricing: GPT-4o’s API pricing is half that of GPT-4 Turbo. With GPT-4o, input costs $5 per million tokens while output costs $15 per million tokens. Considering that GPT-4o’s token throughput (tokens per second) is almost 3x that of GPT-4 Turbo, the value proposition is much better for GPT-4o.
Image Source: AI Supremacy
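The pricing comparison above can be sanity-checked with a quick calculation. The $5/$15 per-million-token rates for GPT-4o come from the announcement; the $10/$30 rates for GPT-4 Turbo are assumed here, implied by the statement that GPT-4o costs half as much. The workload figures are purely illustrative:

```python
# Hypothetical workload: 2M input tokens and 0.5M output tokens per month.
INPUT_TOKENS = 2_000_000
OUTPUT_TOKENS = 500_000

def monthly_cost(input_rate: float, output_rate: float) -> float:
    """Cost in USD, given per-million-token rates for input and output."""
    return (INPUT_TOKENS / 1e6) * input_rate + (OUTPUT_TOKENS / 1e6) * output_rate

gpt4o_cost = monthly_cost(5, 15)    # GPT-4o: $5 in / $15 out per million tokens
turbo_cost = monthly_cost(10, 30)   # GPT-4 Turbo: assumed $10 / $30 rates

print(f"GPT-4o:      ${gpt4o_cost:.2f}")   # $17.50
print(f"GPT-4 Turbo: ${turbo_cost:.2f}")   # $35.00
```

At these rates the per-token bill halves, and since GPT-4o also delivers roughly 3x the token throughput, the effective cost per unit of work drops further still.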

Implications of GPT-4o

Improved Human-Machine Interaction

During the model demonstration, GPT-4o showcased its ability to create more natural conversations. It can generate voice responses in various emotive styles and adjust its answers in real time, even when interrupted or given additional information. This adaptability is a game-changer for human-machine interaction, positioning OpenAI at the forefront of this rapidly evolving field.

OpenAI’s investment in humanoid companies like Figure hints at the broader applications of GPT-4o. The advanced capabilities of this model could significantly enhance the functionality of humanoid robots, making interactions with these machines more fluid and human-like. Additionally, AI devices like wearables and smartphones stand to benefit immensely from GPT-4o’s real-time processing and contextual understanding.

Transforming Customer Service and Virtual Assistants

With its improved contextual understanding and ability to handle complex tasks, GPT-4o is poised to revolutionize customer service and virtual assistants. Its quick, accurate and context-aware responses could enhance user satisfaction and efficiency in these domains, setting new standards for AI-driven interactions. Siri looks outdated compared to the GPT-4o voice assistant, and it will be interesting to see how GPT-4o is integrated with devices to search and answer questions based on on-device files.

Advancing Language Translation

GPT-4o’s multilingual capabilities are particularly impressive. During the demonstration, the model translated from English to Italian almost instantaneously, showcasing its potential to improve language translation services. This feature can facilitate more accurate and context-aware translations, bridging communication gaps across different languages.

Personalized Learning Experiences

In education, GPT-4o could offer more personalized and effective learning experiences by adapting content to individual learners’ needs and preferences. For instance, the model’s ability to assist with solving mathematics problems step by step, though seemingly basic, holds the potential to transform educational practices by providing tailored support to students. Schools and colleges are geared towards one-to-many interaction, leaving some learners behind. GPT-4o as a personal tutor can give students one-on-one support. However, it remains to be seen how efficient and effective the model is at solving complex problems.

Concerns on Potential Misuse

There are ethical considerations and societal implications in developing human-like AI technologies, as they are the next step toward AGI. The new models could be misused to create potentially manipulative AI companions. The model’s ability to process audio and visual inputs could also be used to generate highly realistic but fabricated content, such as deepfake videos or synthetic voices, which can be difficult to distinguish from authentic content.

First Impressions

Counterpoint’s team tested GPT-4o on the mobile application as well as in the browser, and the model’s analytical prowess proved remarkable. The team uploaded a stock chart for analysis and shared the results with a seasoned stock technical expert, who was thoroughly impressed by GPT-4o’s output.

Image Source: Mohit Agrawal, Counterpoint Research

In another test, we provided the model with a stock report for ABN AMRO and requested a summary. Remarkably, not only did GPT-4o summarize the report accurately, but it also responded with precision to pointed questions derived from the document. Some inquiries even required the model to interpret charts within the report, which it delivered accurately and without hesitation.

However, the mobile application’s audio experience fell short of expectations. High latency detracted from the smoothness anticipated from OpenAI’s demo event. Despite significant lag in translating from English to Italian, the quality of translation remained exceptional, demonstrating the model’s linguistic prowess.

On the downside, the free version of the application often ran out of credits, hindering file uploads and leading to downgrades to GPT-3.5. However, there was a silver lining: limit resets became more frequent, with the reset interval shortening from every 12 hours to every 5 hours. We expect limits to increase substantially as capacity constraints are addressed – a familiar hurdle OpenAI faced during its initial launch.

Conclusion

OpenAI’s focus with GPT-4o is clear – enhancing user experience. By prioritizing the integration of advanced features and a user-friendly interface, OpenAI aims to maintain its competitive edge. The commitment to improving human-machine interaction highlights the company’s strategic direction in the AI landscape.

GPT-4o represents a significant advancement in AI technology, not through the introduction of a next-generation model like GPT-5, but by fundamentally improving how users interact with AI. Its real-time multimodal integration, enhanced features and focus on user experience make it a pivotal development in the AI field. As OpenAI continues to innovate, GPT-4o stands as a testament to the company’s dedication to leading the future of human-machine interaction.


GenAI in Retail: Opportunities Galore, Challenges Remain

  • For retailers to fully embrace the opportunity of advanced technologies like GenAI, solution enablers need to find ways to remove obstacles holding back implementation.
  • There are opportunities left untouched right now to make use of the existing device capabilities and push user engagement further. This is because of long-held beliefs as to how users prefer to engage with their devices and content.
  • There have been instances of technology rollouts being reversed, confirming the need for a careful evaluation of available solutions.

The Location Based Marketing Association (LBMA) recently held its annual RetailLoco conference in Minneapolis, Minnesota. The event brings together experts and stakeholders from across the retail technology ecosystem. This year’s edition showcased GenAI brand activations while underscoring the persistent challenges retailers face in integrating advanced smartphone features and AI across their physical and online shopping experiences.

Opportunities to grow for all

While a cautious approach to AI and GenAI was a theme throughout the event, there were a few interesting implementations highlighted:

  • Burger King Brazil ran a promotion in which menu combinations were presented to mobile app users after the app scanned their facial expressions.
  • L’Oréal’s Perso at-home assistant integrates BreezoMeter weather and location information to provide personalized makeup suggestions.
Example of L'Oreal AI assistant.
Image source: Gerrit Schneemann, Counterpoint Research

These activations show the potential of GenAI capabilities to engage customers in new ways. At scale, these types of examples can lead to personalized shopping or dining experiences, previously impossible to deliver.

For retailers to fully embrace the opportunity of advanced technologies like GenAI, AR/VR and network-based positioning, solution enablers need to meet them halfway and find ways to remove most of the obstacles currently holding back implementation:

  • Retailers must prioritize the seamless integration of mobile payments and user-centric experiences.
  • AI-driven solutions possess the capability to enhance operational efficiency, personalize interactions and streamline business processes, be it predictive maintenance in manufacturing or AR-enhanced marketing endeavors.
  • Technology providers must recognize that a one-size-fits-all approach to the deployment of new technologies will not work, and the availability of customizable solutions will be critical for adoption.
  • End users are likely to embrace new technologies if there is a clear benefit for them. This includes sharing location information, downloading specific apps, and changing device use patterns to take advantage of new features.

Fragmentation of backend systems overshadows new technologies’ potential utility 

One of the key themes of the presentations by representatives from companies like Harley Davidson, Glympse and Kroger was that technology availability does not necessarily translate to a clear strategy on how to implement these solutions across mobile apps, online, and retail locations.

A striking example of the lack of utilization of existing features on smartphones is augmented reality (AR) for discovery and guidance within stores. The focus of navigation remains on A-to-B guidance to a store with heavy stress on displaying the famous blue dot on the map, with pedestrian guidance and AR content integration appearing out of favor at this point.

There are opportunities left untouched right now to make use of the existing device capabilities and push user engagement further. This is because of long-held beliefs as to how users prefer to engage with their devices and content. One such point of contention is how users would engage with AR content, and the need to hold up a phone to view it. However, this should be less of an issue now as video content creation across social media is standard behavior for many. Gone are the days when phones were neatly tucked away while on the go.

Another challenge to the full embrace of advanced mobile features appears to be the long tail of proprietary internal systems. Retailers feel inadequately prepared to navigate technologies such as AI, chatbots, AR and VR. Constraints stemming from budgetary limitations, dearth of internal resources and lack of executive endorsement impede their adoption efforts further. While mobile payments, AI and chatbots hold immense promise, retailers grapple with the complexities of implementation.

Kroger feedback loop.
Image source: Gerrit Schneemann, Counterpoint Research

Importantly, there have been instances of technology rollouts being reversed, confirming the need for a careful evaluation of available solutions. Walmart’s reversal on self-checkout stations and findings that Amazon’s touchless shopping experience in physical stores was powered by a large workforce decoding video streams instead of AI are key examples that are likely to cause large organizations to pause and re-evaluate their plans and adjust accordingly.

RetailLoco 2024 served as an interesting snapshot of the status quo and future of retail, where AI and innovation are significant opportunities for all involved in the value chain. However, a host of obstacles stands in the way of faster adoption of the most advanced solutions, driven by an undeniable wariness about over-committing to potentially flawed technologies and the realities of proprietary internal systems ill-equipped to deal with fast-changing 5G and AI solutions.


Meet Counterpoint at Computex Taipei 2024 Event

Counterpoint will be attending the Computex Taipei 2024 event from June 4 to June 7.

Our team of directors and analysts will be attending Computex Taipei 2024. You can schedule a meeting with them to discuss the latest trends in the technology, media and telecommunications sector and understand how our leading research and services can help your business.

Here is the list of team members attending the event:

When: June 4 – June 7 | 9:30 AM – 5:30 PM

Where: Taipei Nangang Exhibition Center, Hall 1 (TaiNEX 1) | Taipei Nangang Exhibition Center, Hall 2 (TaiNEX 2)

About the event:

COMPUTEX has grown and transformed with the industry, establishing its reputation as one of the world’s leading technology trade shows. This year’s expo will continue with the theme of “Connecting AI”, featuring the latest tech trends: AI Computing, Advanced Connectivity, Future Mobility, Immersive Reality, Sustainability, and Innovations.

Click here (or email Counterpoint Taiwan PR at Shirley.cheng@counterpointresearch.com) to schedule a meeting.

Read more about the Computex Taipei 2024 event.


Huawei Expands its AI Ambition With Pangu Large Models

  • Huawei has underlined its pivotal new objective for 2024 – to capitalize on AI’s strategic prospects and enhance intelligence from networks to industries.
  • Huawei not only aims to dedicate resources towards fundamental AI research to foster continuous innovation but also aims to actively participate in the formation of global AI policies.
  • During the summit, Huawei revealed the current progress in training Pangu models with an impressive 230 billion parameters.

Huawei conducted its annual Huawei Analyst Summit (HAS) in Shenzhen, China, between April 17 and April 20. A team of analysts from Counterpoint attended this 21st edition of HAS to get updates on the company’s progress, vision, and strategy for 2024 and beyond. We utilized the opportunity to also enquire about Huawei’s AI strategy during the keynote’s Q&A segment with Huawei Deputy Chairman and Rotating Chairman Eric Xu.

It was clear from Xu’s statements and the keynote that this year’s HAS marked a leap forward for Huawei from previous events. The company underlined its pivotal new objective for 2024 – to capitalize on AI’s strategic prospects and enhance intelligence from networks to industries. Huawei intends to harness AI to boost the appeal and performance of its products and services, optimize internal operations, and save as well as make more money by improving business and operational performance across industries. Huawei not only aims to dedicate resources towards fundamental AI research to foster continuous innovation but also aims to actively participate in the formation of global AI policies.

Pangu models take the spotlight

Pangu models, introduced by Huawei in 2021 as the world’s largest pre-trained Chinese large language models (LLMs) with over 100 billion parameters, are now advancing into their fourth iteration. During the summit, Huawei revealed the current progress in training Pangu models with an impressive 230 billion parameters.

Pangu for industry

Huawei is helping industries with ready-to-call AI models offered as Ascend AI-as-a-Service. Huawei has also successfully implemented its Pangu large models in industry-specific applications to drive value creation and solve major challenges. Notable deployments driving intelligent R&D and manufacturing include:

  • AI-assisted coding copilot that enhances R&D efficiency by 50%.
  • AI-augmented visual inspection systems in manufacturing that attain an accuracy rate of 99%.
  • The Pangu Mining Model intelligently analyzes the quality of stress-relief drilling and assists rock-burst prevention personnel with quality verification in a coal mine, reducing the manual review workload by 82% and delivering a 100% acceptance rate for rock-burst prevention engineering work.

Pangu for science

Huawei also highlighted Pangu’s advanced simulation technologies for science:

  • Pangu-Weather brings a revolutionary increase in the speed of weather forecasting. It achieves a simulation velocity that is 10,000 times faster than current standards. Such a leap could drastically improve the precision and reliability of meteorological predictions, benefiting everything from agriculture to disaster preparedness.
  • Pangu Fluid focuses on fluid dynamics simulations, with speeds exceeding current capabilities by 20 times or more. Enhanced simulation speeds can be crucial for a range of applications, including aerodynamics, climate research and engineering. 
  • Pangu Drug points toward a breakthrough in drug discovery with the generation of 100 million new molecules and a tenfold increase in the efficiency of drug design processes. Notably, it also claims a 50% increase in the success rate, potentially accelerating the pace of pharmaceutical innovation and therapeutic breakthroughs.

Next move: Pangu for device, Celia as super AI agent

Amid the surge of LLM chatbots like ChatGPT, Huawei is set to transform its smart assistant Celia into an advanced AI agent. This upgrade will be powered by its Pangu foundation models at the device level.

Celia will be equipped with capabilities for perceiving user intent, delivering anticipatory services, summarizing content, conducting intelligent image searches, and providing a superior AI experience in varied contexts such as work, fitness, entertainment, travel, and home settings.

Key takeaways

  • Huawei’s Pangu models distinctively target industrial applications, optimizing operations, product R&D and software engineering with remarkable precision and speed, setting them apart from broader-focus models like Baidu’s Ernie Bot and Alibaba’s Tongyi Qianwen. However, this might limit their applicability in general-purpose generative AI models, where applications such as OpenAI’s ChatGPT excel.
  • Given Huawei’s well-established ecosystem, the success of Pangu models in industrial settings heavily relies on collaborations with its partners. This could restrain its ability to operate and scale independently.
  • Despite lagging behind SenseTime and Baidu in LLMs and multimodal generative AI, Huawei is playing catch-up, injecting more resources into fundamental AI research, especially in AI agents and world-model building – the core technologies for realizing on-device chatbots and text-to-video applications.
  • Challenges remain in getting the required computing power for training larger AI models, as Huawei’s in-house Ascend 910B processors, while capable, still fall short of the superior performance levels offered by NVIDIA’s latest chips.
  • In summary, the 2024 edition of HAS unveiled Huawei’s aggressive AI strategy, marking a strategic pivot to capitalize on the capabilities of its Pangu foundation models, with significant ramifications across diverse industries.


Apple Could be Looking for iPhone GenAI Partnerships in China and Globally

  • Rumors of Apple partnering with external GenAI providers highlight potential nuanced AI strategy to meet the needs of disparate markets.
  • Regulatory compliance, local needs are drivers for possible Baidu tie-up in China.
  • Apple’s global GenAI strategy – the how and why of in-house vs Gemini.

Generative Artificial Intelligence (GenAI) has emerged as a focal point of innovation within the smartphone industry, captivating the attention of Original Equipment Manufacturers (OEMs) worldwide. Companies such as Samsung, HONOR, and Google have made significant strides by introducing smartphones powered by GenAI technology, reshaping the landscape of mobile devices. Notably, within the Android ecosystem, GenAI integration has taken center stage, positioning it as a frontrunner in leveraging AI capabilities.

In contrast, Apple has been perceived as lagging in this domain, prompting recent discussions of potential collaborations with tech giants like Google and Baidu to integrate their advanced GenAI solutions into the Apple ecosystem. Anticipation is building as rumors suggest that Apple may unveil details of these strategic partnerships at the upcoming Worldwide Developers Conference (WWDC) scheduled for June, signaling a potential shift in the dynamics of AI adoption in the smartphone market.

Bloomberg and the New York Times recently published articles about Apple’s potential deal with Google. Chinese media also carried news about Apple being in talks with Baidu to introduce GenAI technologies in the iPhone.

Here is our take on the contours of Apple’s potential partnership with Baidu and Google:

Apple With Baidu, All for Compliance?

Regulatory Compliance 

In the dynamic landscape of Chinese technology regulations, Apple’s pursuit of integrating GenAI into its ecosystem necessitates collaboration with a local partner. Additionally, a tie-up with a Chinese partner will also enable Apple to bring local knowledge to its offering.

The stringent regulatory framework in China – in particular, those around the collection of sensitive public data – makes it highly improbable for AI models developed by Western entities to receive the necessary approvals. AI models of Chinese origin are more favored. This regulatory environment compels Apple to forge new alliances within the country, with Baidu emerging as a potential partner.

Localized AI Solutions

The GenAI model from Baidu, China’s top internet search engine provider, is one of 40 models approved by Chinese regulators; its most famous offering is the Ernie Bot.

Baidu claims its latest version, the Ernie Bot 4.0, outperforms GPT-4 in Chinese, leveraging the world’s largest Chinese language corpus for training.

It excels in grasping Chinese linguistic subtleties, traditions and history, and can compose acrostic poems – areas where ChatGPT may struggle.

Although nothing has been officially confirmed, should negotiations between Apple and Baidu prove fruitful, Baidu’s progress in on-device LLMs would accelerate strongly, particularly in system optimization, given that about one in five Chinese smartphone users owns an iPhone.

Apple might also consider partnering with other GenAI providers in China, like Zhipu AI and its prominent GLM-130B model, or Moonshot AI with its Kimi Chatbot, notable for processing up to 2 million Chinese characters per prompt.

Whichever provider Apple chooses, embedding a Chinese LLM into the forthcoming iOS 18 seems improbable, as Apple will not let any third-party model weaken its influence over the iOS ecosystem. Consequently, Apple might seek a more collaborative partnership with Baidu, with inferencing largely restricted to the cloud. Such a partnership would showcase Apple’s flexibility and strategic maneuvering within China’s regulatory framework, marking a new era for its AI ventures in the Chinese market.

Apple and Google, Friend or Foe?

Apple is also rumored to be considering a global collaboration with Google to integrate the Gemini AI model on iPhones.

Catching up in AI

Gemini is a suite of large language models developed by Google, renowned for its robust capabilities and extensive applications. While Apple is actively developing its own AI models, including a large language model codenamed Ajax and a basic chatbot named Apple GPT, its technology is unproven.

This makes collaboration with Google a more prudent choice for now. The potential use of Google’s Gemini AI would underscore an effort from Apple to incorporate sophisticated AI functionalities into its devices, beyond its in-house capabilities​​.

External AI integration: While Apple may use its own models for some on-device features in the upcoming iOS 18 update, partnering with an external provider like Google for GenAI use cases (such as image creation and writing assistance) could indicate a significant shift towards enhancing user experiences with third-party AI technologies​​.

Apple might follow the strategy of Chinese OEMs like OPPO, vivo, HONOR and Xiaomi, all of which have deployed in-house LLMs in their latest flagship models while also working with third parties like Baichuan, Baidu and Alibaba to leverage a broader GenAI ecosystem. With this strategy, Chinese OEMs retain full control of the local big model and the resulting hardware and software changes closely tied to smartphone design, and they can continue to profit by simultaneously owning an app store and an AI agent. If Apple were to follow this strategy for international markets, it might seek alliances not only with Google but also with OpenAI and Anthropic. This approach would mean Apple does not have to open iOS to third-party LLMs.

If Apple were indeed to incorporate Gemini into its new operating system to offer users an enhanced, more intelligent experience, these advancements would primarily be on-device rather than cloud-based. This makes Google’s Gemini Nano, the most lightweight model in the family, a preferred choice, as it has already been optimized for and embedded in Google’s Pixel 8 smartphone.

Implications of Apple’s Potential LLM Partnerships

In China, a potential partnership with Baidu would ensure Apple’s offerings are both culturally attuned and compliant with local regulations, while the global collaboration with Google promises to inject a new level of intelligence into the iPhone’s capabilities, thus redefining the essence of mobile user experience for Apple.

These types of partnerships would highlight a nuanced approach to AI integration, tailored to meet the distinct needs of different markets.


Counterpoint Conversations: Humane’s Ai Pin Reimagines Computing with an AI-Powered OS

AI was the buzzword at almost every other booth at MWC 2024. One product that caught everyone’s attention was the Ai Pin by Humane, a Silicon Valley startup founded by ex-Apple executives. Moving beyond smartphones and touchscreens, this tiny wearable device offers a new way of computing with gestures and just your voice. We got to see a quick demo of the Humane Ai Pin at work. Ritesh Bendre, Global Content Manager at Counterpoint Research, met with Imran Chaudhri, Co-Founder and CEO of Humane, to talk about the revolutionary AI product that aims to redefine how we interact with technology.

The Interview

Key Takeaways from the Discussion:

Motivations behind Humane’s Ai Pin:

• Smartphones have reached the peak of a 15-year cycle.
• Innovation has plateaued.
• Need for a new form factor with intelligence and agility.
• Freeing users from the limitations of touchscreens and keyboards.
• Not a replacement for existing smartphones and PCs, but a companion device.

New ways to interact with voice and laser:

• Ai Pin boasts a screenless design.
• It uses voice commands for basic interactions.
• Also features a laser projection display that appears only when needed.
• Includes a camera that records short videos, captures photos with AI assistance, and more.

AI at the core:

• Features a custom-built AI operating system (AI-OS) created from the ground up.
• AI integration aims to automate many user tasks.
• Allows for a more seamless and effortless computing experience.

A glimpse into the future:

• Chaudhri believes AI will play a central role in future Humane products.
• AI will also play a crucial role across various sectors, including health, education, and information access.
• Automation through AI will improve user experience and interaction with future computing devices.

Analyst Takeaways:

  • The Humane Ai Pin gives us an early glimpse at what a phone-free future could look like.
  • The Ai Pin can respond to queries, make and receive calls, take photos, and record videos – doing more than an AI assistant-powered smart speaker can.
  • However, the device’s high initial cost of $699, followed by a monthly subscription of $24, and availability only in the US could hinder its mainstream adoption.


Claude 3 Dethrones GPT-4 to Mark Phase Two in LLM Competition

Anthropic, one of OpenAI’s top competitors, introduced its latest Large Language Model (LLM), called Claude 3, in early March. The AI community was taken by surprise as Claude 3’s capabilities proved to be superior to those of OpenAI’s flagship GPT-4, marking the first instance of GPT-4 being outperformed. Meanwhile, Google’s Gemini Ultra trailed behind both.

The launch of Claude 3 ushered in what appears to be the second phase of LLM competition, where companies prioritize in-context understanding, robustness and reasoning over mere scale. The generative AI sector has recently been accelerating rapidly on the back of contributions from key players including OpenAI, Anthropic, Google, Meta and Mistral AI.

The first phase of the LLM competition was set in motion following the debut of OpenAI’s ChatGPT in late 2022. This phase was characterized by a race to scale, with companies vying to develop increasingly powerful models primarily focused on size and computational capabilities.

OpenAI’s GPT-4 once epitomized the zenith of these efforts, setting benchmarks for what generative AI could achieve in terms of understanding and generating human-like text. Many subsequent LLMs, including Google’s Gemini series, Anthropic’s Claude 2, Meta’s Llama series and Mistral AI’s Mistral Large, continued to challenge the dominance of GPT-4, yet failed.

However, the ascendancy of Anthropic’s Claude 3 signifies a paradigm shift to a new era in which the battlefield has become multipolar.

Phase Two Begins

We think GPT-4 being surpassed by Claude 3 marks the second stage of the LLM contest:

  • The Claude 3 family comprises three cutting-edge models – Claude 3 Haiku, Claude 3 Sonnet and Claude 3 Opus – arranged in order of increasing capability. Claude 3 Opus is superior to GPT-4 in all key performance benchmarks.
A chart comparing Claude with GPT and Gemini across various parameters
Source: Anthropic website: Claude 3 announcement
  • Claude 3 shows an unprecedented level of understanding of advanced science. For example, Kevin Fischer, a theoretical quantum physicist, was astounded by Claude 3’s grasp of his doctoral thesis.

A snip of Kevin Fischer's Tweet on Claude

  • Claude 3 not only comprehends complex scientific principles but also exhibits a degree of emergent capability. For example, another expert in quantum computing was taken aback when Claude 3 reinvented his algorithm with just two prompts, without seeing his yet-to-be-published paper.

A screen snip of Guillaume Verdon's Tweet on Claude

  • Claude 3 also displays a degree of “meta-awareness” (which may simply be superb pattern matching over human-generated data): during the needle-in-the-haystack evaluation, it figured out that it was being tested. This method, named after the idiom “finding a needle in a haystack,” is designed to ascertain whether an LLM can accurately pinpoint a key fact buried within hundreds of thousands of words. First devised by Greg Kamradt, a member of the open-source community, the approach quickly gained traction among major AI companies. Google, Mistral AI and Anthropic now commonly showcase their new models’ performance through these tests.

A screen snip of Alex Albert's Tweet on Claude

Chart: Claude 3 Opus recall accuracy over a 200K-token context, with observations
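To make the needle-in-a-haystack method concrete, here is a minimal, self-contained sketch of such a harness. The `toy_model` function is a hypothetical stand-in that does a naive substring search, not a real LLM call; in a real evaluation it would be replaced by an API call to the model under test, and the needle would be planted at many depths and context lengths.

```python
import random

def build_haystack(needle: str, n_filler: int, depth: float) -> str:
    """Assemble a long context with the needle buried at a relative depth."""
    filler = ["The sky over the harbor was a calm, even grey that morning."] * n_filler
    pos = int(len(filler) * depth)
    return " ".join(filler[:pos] + [needle] + filler[pos:])

def toy_model(context: str, question: str) -> str:
    """Stand-in for a real LLM call: a naive substring search.
    A real test would send `context` plus `question` to the model's API."""
    for sentence in context.split(". "):
        if "magic number" in sentence:
            return sentence.strip()
    return "not found"

# Plant one key fact roughly two-thirds of the way into ~2,000 filler sentences.
needle = "The magic number for the experiment is 7481."
haystack = build_haystack(needle, n_filler=2000, depth=0.63)
answer = toy_model(haystack, "What is the magic number?")
print(answer)  # the sentence containing 7481
```

A full evaluation sweeps `depth` from 0.0 to 1.0 and the context length up to the model's limit, then charts recall at each point, which is what Anthropic's 200K-token recall chart above depicts.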

What does it mean to be in Stage Two of the LLM competition?

Linear vs Accelerated Progress

Innovation on the LLM battlefield is clearly accelerating. Even though it is only March, a host of contenders, such as Google’s Gemini Ultra and Mistral AI’s Mistral Large, have already attempted to take the throne from OpenAI’s GPT-4. However, it was Anthropic’s Claude 3 Opus that came out on top, marking a pivotal breakthrough in the ongoing quest for supremacy.

Open vs Closed

The rivalry within the realm of closed-source LLMs has escalated, positioning closed-source generative AI technology as a pivotal element of any company’s defensive “moat”.

For instance, Mistral AI initially captured attention with its impressive open-source Mixture of Experts (MoE) lean models but has now pivoted to spotlight its proprietary Mistral Large model.

Advice for Developers

In the ever-changing LLM landscape, developers need to understand that building assessments tailored to your specific use case, ones that truly gauge a model’s strengths and weaknesses, is more important than blindly trusting general benchmarks:

  • Stay agile, ready to integrate newer models or versions as they become available. Today’s choice might need reassessment tomorrow.
  • Understanding each model’s unique strengths and continuously exploring and adapting cannot be overemphasized, given the specific needs of your applications.
  • Much like choosing the right armor for battle, adapting your prompts to each model is crucial to maximizing its potential. Comprehensive prompting tutorials are readily available online.
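The advice above can be sketched as a minimal use-case-specific evaluation harness. Everything here is illustrative: the two “models” are trivial stand-ins, and in practice each callable would wrap an API call to the provider being compared, with test cases drawn from your actual application.

```python
def run_eval(model_fn, test_cases):
    """Score a model on your own test cases instead of public benchmarks.
    Each test case is a (prompt, check) pair, where check() judges the output."""
    passed = sum(1 for prompt, check in test_cases if check(model_fn(prompt)))
    return passed / len(test_cases)

# Hypothetical models: one echoes its input, one upper-cases it.
# Real candidates would be wrappers around GPT-4, Claude 3, Gemini, etc.
model_a = lambda prompt: prompt
model_b = lambda prompt: prompt.upper()

# Domain-specific cases with pass/fail checks on the output.
cases = [
    ("return this text unchanged", lambda out: out == "return this text unchanged"),
    ("KEEP CAPS INTACT", lambda out: out == "KEEP CAPS INTACT"),
]

scores = {name: run_eval(fn, cases)
          for name, fn in [("model_a", model_a), ("model_b", model_b)]}
print(scores)  # {'model_a': 1.0, 'model_b': 0.5}
```

The point of the pattern is that the checks encode what *your* application needs, so re-running the same harness against each new model release makes the “stay agile, reassess tomorrow” advice cheap to follow.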

Guest Post: AI Ecosystem – The Race Begins

While everyone is going to be focused on AI inference in 2024, the real action is going to be in the ecosystem where new players and old will slug it out to ensure that users and developers use their models.

The foundation model is to AI what the OS is to the smartphone

  • The players are all working to get their ducks in a row to be ready to launch two things in 2024:
    • First, software development kit (SDK): This is a piece of software or an interface that makes it easy for developers to retrain a foundation model in order to perform a certain service.
    • This is the AI equivalent of the SDKs that are used to write apps for iOS, Android, Windows and so on.
    • It turns out that the foundation model is becoming a control point in the AI ecosystem as they are difficult and expensive to create and difficult to change once one has based a service on one particular model.
    • Hence, it looks like the foundation model will have the same strategic importance as the operating system in consumer devices like smartphones and tablets.
    • Second, AI store: This is the equivalent of an app store on iOS or Android and provides a marketplace for developers to sell the services that they create on the foundation model of the store’s owner.
  • This is a precise repeat of the strategy that we saw in 2008 from Apple and in 2012 from Google. It allowed the smartphone ecosystem to grow into the behemoth that it is today.
  • It is also how the owners of the foundation models intend to grow and monetize the AI ecosystem. If a platform can become the go-to place to create, buy or sell a service then a lot of money can be made.
  • This is why during 2024 we will see a lot of launches of both SDKs and stores as the contenders for the AI ecosystem begin jostling for position.
  • The outcome of this battle will define who wins and who loses in the AI ecosystem. And there are vast amounts of money at stake, given how useful generative AI can be.
  • The early leader is OpenAI, which seemed to have the race sewn up, but immediately after it launched its SDK and GPT Store, a bout of total self-immolation, which is far from resolved, opened the door for everyone else.
  • Many developers who have already committed to using GPT are now much less certain about their choice. The corporate instability raises questions about the long-term viability of GPT as a development platform.
  • Most of the other players are straightforward, for-profit companies, which means that committing to Google’s Gemini as the foundation is immediately less risky.
  • Furthermore, I remain far from convinced that OpenAI is massively better than anyone else. It did come to market first but appears to have subsequently lost its lead.
  • Hence, I think that the AI ecosystem remains wide open and just like the smartphone ecosystem, I suspect that there will be 3 to 5 large players who take most of the market with a sprinkling of smaller niche players around the edge.
  • It is this battle to be one of the 3 to 5 that is likely to commence this year and we will see this played out at developer conferences and events where the AI ecosystem tools will be launched.
  • OpenAI’s competitors are much more attractive than they were a few months ago as their governance structures are much less flawed.
  • Hence, as a developer, I would have far more confidence that Google and Meta will not blow up in the way that OpenAI did, although they still make plenty of silly mistakes just like everyone else.
  • 2023 was the year of training but I think this will begin to give way to inference in 2024 as algorithms begin to be deployed and the battle for the AI ecosystem heats up.
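As a purely hypothetical illustration of the SDK-plus-store pattern described above, the sketch below stubs out what such a developer workflow might look like. Every class, method and model name here is invented for illustration and does not correspond to any real vendor API.

```python
class FoundationModelSDK:
    """Toy stand-in for a vendor SDK that adapts a foundation model.
    All names are hypothetical; a real SDK would make network calls."""

    def __init__(self, base_model: str):
        self.base_model = base_model
        self.examples = []

    def add_training_example(self, prompt: str, completion: str):
        """Collect (prompt, completion) pairs for retraining."""
        self.examples.append((prompt, completion))

    def fine_tune(self) -> str:
        """A real SDK would upload the examples and launch a training job,
        returning the ID of the customized model."""
        return f"{self.base_model}-custom-{len(self.examples)}ex"

    def publish_to_store(self, model_id: str, listing_name: str) -> dict:
        """The AI-store step: the equivalent of listing an app in an app store."""
        return {"model": model_id, "listing": listing_name, "status": "listed"}

sdk = FoundationModelSDK(base_model="acme-fm-1")
sdk.add_training_example("Summarize: ...", "Short summary.")
job = sdk.fine_tune()
listing = sdk.publish_to_store(job, "Contract Summarizer")
print(listing)
```

The strategic point in the bullets above maps directly onto these two steps: the SDK (retrain) creates lock-in to the base model, and the store (publish) monetizes the services built on top of it.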

(This guest post was written by Richard Windsor, our Research Director at Large. This is a version of a blog that first appeared on Radio Free Mobile. All views expressed are Richard’s own.)
