OpenAI Holds 1st DevDay

OpenAI held its highly anticipated DevDay event today, November 6th, showcasing an array of new capabilities, upgrades, and pricing changes for its developer platform. With a strong emphasis on power, accessibility, and customization, OpenAI aims to equip developers to build the next generation of AI applications. Here is a summary of the key announcements from OpenAI DevDay 2023:

GPT-4 Turbo

OpenAI kicked off DevDay 2023 by unveiling GPT-4 Turbo, the next iteration of its flagship language model. GPT-4 Turbo brings greatly enhanced capabilities, most notably a 128K context window, enough for prompts equivalent to more than 300 pages of text. The larger window lets developers feed the model far more information at once, resulting in more contextually aware and nuanced AI-powered applications. GPT-4 Turbo also offers improved performance at a fraction of the cost: input tokens are 3x cheaper and output tokens are 2x cheaper than GPT-4. The knowledge cutoff moves up to April 2023. On the ChatGPT side, the interface is being simplified, removing the model picker in favor of automatically sensing and choosing the right capabilities for the prompt. The new model is available to developers now, with a stable release rolling out to paying customers, probably in a few weeks.
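For developers, GPT-4 Turbo is exposed through the Chat Completions API. A minimal sketch, assuming the v1 OpenAI Python SDK and the preview model name used at launch (gpt-4-1106-preview):

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Call the GPT-4 Turbo preview model announced at DevDay
response = client.chat.completions.create(
    model="gpt-4-1106-preview",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Summarize this 200-page report in five bullets."},
    ],
)
print(response.choices[0].message.content)
```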

The Assistants API

Another major highlight of OpenAI DevDay 2023 was the introduction of the Assistants API, designed to facilitate agent-like experiences within applications. The Assistants API lets developers build AI-powered assistants with specific instructions, personalized knowledge, and the ability to call models and tools to perform tasks. The API provides a suite of advanced capabilities, including code execution, knowledge retrieval, and function calling. By offloading that heavy lifting to the Assistants API, developers can focus on building high-quality AI applications that deliver exceptional user experiences.
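The basic flow pairs an assistant with a thread of messages and a run that executes the assistant against the thread. A brief sketch of that flow, using the beta endpoints in the Python SDK as announced:

```python
from openai import OpenAI

client = OpenAI()

# An assistant bundles instructions, a model, and tools
assistant = client.beta.assistants.create(
    name="Math Tutor",
    instructions="You are a personal math tutor. Write and run code to answer questions.",
    tools=[{"type": "code_interpreter"}],
    model="gpt-4-1106-preview",
)

# Conversation state lives in a thread; user messages are appended to it
thread = client.beta.threads.create()
client.beta.threads.messages.create(
    thread_id=thread.id,
    role="user",
    content="Solve 3x + 11 = 14 for x.",
)

# A run executes the assistant against the thread
run = client.beta.threads.runs.create(thread_id=thread.id, assistant_id=assistant.id)
```

Runs are asynchronous, so in practice you poll the run's status and then read the assistant's reply from the thread's messages.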

Multimodal Capabilities

OpenAI continues to expand multimodal capabilities, enabling AI models to process and generate content across different modalities. With GPT-4 Turbo, developers can now harness the power of vision, as the model can accept images as inputs in the Chat Completions API. This feature opens up new possibilities, such as generating captions, analyzing real-world images in detail, and reading documents with figures. Additionally, OpenAI introduced DALL·E 3 integration, allowing developers to programmatically generate images and designs through the Images API. With these multimodal capabilities, AI applications can now seamlessly incorporate vision-based tasks, enhancing their overall functionality and user experience.
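A short sketch of both additions, assuming the preview vision model name (gpt-4-vision-preview) and the Images API; the image URL here is just a placeholder:

```python
from openai import OpenAI

client = OpenAI()

# Vision: pass an image alongside text in a single chat message
response = client.chat.completions.create(
    model="gpt-4-vision-preview",
    messages=[{
        "role": "user",
        "content": [
            {"type": "text", "text": "Describe what is in this image."},
            {"type": "image_url", "image_url": {"url": "https://example.com/photo.jpg"}},
        ],
    }],
    max_tokens=300,
)
print(response.choices[0].message.content)

# DALL·E 3: generate an image programmatically through the Images API
image = client.images.generate(
    model="dall-e-3",
    prompt="A watercolor illustration of a developer conference keynote",
    size="1024x1024",
)
print(image.data[0].url)
```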

In addition to the vision-based modalities, there is a new text-to-speech (TTS) modality that lets developers generate human-quality speech from text via the TTS API, with six preset voices available. TTS comes in two models, tts-1 and tts-1-hd: tts-1 is designed for real-time use cases, while tts-1-hd is optimized for quality.
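A minimal sketch of the new audio endpoint; the output filename is arbitrary:

```python
from openai import OpenAI

client = OpenAI()

# Six preset voices are available: alloy, echo, fable, onyx, nova, shimmer
speech = client.audio.speech.create(
    model="tts-1",  # use "tts-1-hd" when quality matters more than latency
    voice="alloy",
    input="Welcome to OpenAI DevDay.",
)
speech.stream_to_file("devday.mp3")
```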

Updated Function Calling

OpenAI recognizes the importance of accurate and efficient function calling in AI applications, and announced significant improvements to it. GPT-4 Turbo is more adept at invoking the right functions with the right arguments, reducing the need for multiple roundtrips with the model. This streamlines the development process and lets developers build more sophisticated, dynamic interactions with their AI models, resulting in enhanced user experiences.
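A sketch of the tools interface with a hypothetical get_weather function; the updated models can also return several tool calls in a single response (parallel function calling):

```python
import json
from openai import OpenAI

client = OpenAI()

# A hypothetical function, described so the model knows when and how to call it
tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get the current weather for a city",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

response = client.chat.completions.create(
    model="gpt-4-1106-preview",
    messages=[{"role": "user", "content": "What's the weather in Paris and Tokyo?"}],
    tools=tools,
)

# One response can contain multiple tool calls, one per city in this case
for call in response.choices[0].message.tool_calls or []:
    print(call.function.name, json.loads(call.function.arguments))
```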

Improved Instruction Following and JSON Mode

With GPT-4 Turbo, OpenAI has made advancements in instruction following tasks, ensuring that the model excels at generating specific formats and adhering to instructions. Whether it's generating responses in XML or following other structured formats, GPT-4 Turbo outperforms its predecessors, making it an ideal choice for applications that require meticulous instruction adherence. Additionally, OpenAI introduced JSON mode, which enables developers to constrain the model's output to generate valid JSON objects. JSON mode is particularly useful for developers utilizing the Chat Completions API outside of function calling, providing greater flexibility and control over the generated outputs.
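Enabling JSON mode is a single request parameter. A minimal sketch; note that the API expects the word "JSON" to appear somewhere in the messages when the mode is on:

```python
from openai import OpenAI

client = OpenAI()

response = client.chat.completions.create(
    model="gpt-4-1106-preview",
    # Constrain the output to a syntactically valid JSON object
    response_format={"type": "json_object"},
    messages=[
        {"role": "system", "content": "Return your answer as a JSON object."},
        {"role": "user", "content": "List three DevDay announcements with one-line summaries."},
    ],
)
print(response.choices[0].message.content)  # parses as JSON
```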

Reproducible Outputs and Log Probabilities

OpenAI understands the importance of reproducibility and control in the development process. To address this, OpenAI introduced the seed parameter, allowing developers to obtain consistent outputs from the model. This feature is invaluable for debugging, writing comprehensive unit tests, and ensuring a higher degree of control over the model's behavior. OpenAI has been utilizing this feature internally for its own unit tests and is excited to see how developers will leverage it. Additionally, OpenAI announced the upcoming release of log probabilities for the most likely output tokens generated by GPT-4 Turbo and GPT-3.5 Turbo. This feature will be particularly useful for building features such as autocomplete in search experiences, further enhancing the usability and functionality of AI-powered applications.
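A sketch of the seed parameter; responses also carry a system_fingerprint field identifying the backend configuration, and determinism should only be expected while that fingerprint stays the same:

```python
from openai import OpenAI

client = OpenAI()

response = client.chat.completions.create(
    model="gpt-4-1106-preview",
    seed=42,         # same seed + same parameters -> (mostly) repeatable output
    temperature=0,
    messages=[{"role": "user", "content": "Name a random animal."}],
)

# Compare fingerprints across calls to know whether repeatability can hold
print(response.system_fingerprint)
print(response.choices[0].message.content)
```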

GPT-3.5 Turbo

In addition to the advancements in GPT-4 Turbo, OpenAI also unveiled an updated version of GPT-3.5 Turbo. This upgraded model supports a 16K context window by default, providing developers with additional flexibility and expanded capabilities. GPT-3.5 Turbo showcases improved instruction following, JSON mode support, and parallel function calling. OpenAI's internal evaluations have demonstrated a 38% improvement on format following tasks, such as generating JSON, XML, and YAML. With the release of GPT-3.5 Turbo, developers can take advantage of these enhancements and benefit from reduced pricing. OpenAI has made input token pricing 3x cheaper compared to the previous 16K model, and output token pricing 2x cheaper, offering developers increased value and affordability.
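The update is addressable by a dated model name; a quick sketch, assuming the gpt-3.5-turbo-1106 identifier announced at DevDay:

```python
from openai import OpenAI

client = OpenAI()

# The 1106 model supports the 16K context, JSON mode, and parallel function calls
response = client.chat.completions.create(
    model="gpt-3.5-turbo-1106",
    messages=[{"role": "user", "content": "Return a YAML list of three fruits."}],
)
print(response.choices[0].message.content)
```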

GPTs

Alongside the developer-platform news, OpenAI introduced GPTs: custom versions of ChatGPT that combine instructions, extra knowledge, and tool use, and that can be created without writing code and shared with others. GPTs that meet OpenAI's standards for inclusion will eventually be listable, and sellable, in a GPT Store. The Assistants API is the developer-side counterpart, extending OpenAI's GPT models so the same kind of tailored, domain-specific experience can be built into any application. It includes Code Interpreter, which lets assistants write and run Python code in a sandboxed execution environment to work through code and math problems iteratively; Retrieval, which augments assistants with knowledge from outside the model, such as proprietary domain data or user-provided documents; and function calling, which lets assistants invoke user-defined functions and incorporate the responses into their messages. Together, these features let developers build AI-powered assistants tailored to their specific use cases that deliver highly personalized experiences.
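Retrieval, for example, is enabled by attaching uploaded files to an assistant. A sketch using the beta parameters as announced; the filename is a placeholder:

```python
from openai import OpenAI

client = OpenAI()

# Upload a document the assistant can search with the Retrieval tool
file = client.files.create(file=open("product_manual.pdf", "rb"), purpose="assistants")

assistant = client.beta.assistants.create(
    name="Support Agent",
    instructions="Answer questions using the attached product manual.",
    tools=[{"type": "retrieval"}, {"type": "code_interpreter"}],
    file_ids=[file.id],
    model="gpt-4-1106-preview",
)
```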

Deep Customization with Fine-Tuning and Custom Models

OpenAI understands that some organizations require even deeper customization than the Assistants API provides. To address this need, OpenAI introduced two programs: GPT-4 fine-tuning and Custom Models. The GPT-4 fine-tuning program lets developers fine-tune the GPT-4 model itself, building on its capabilities and tailoring it to specific use cases. GPT-4 fine-tuning requires more work than GPT-3.5 fine-tuning to achieve meaningful improvements, and OpenAI is actively working on the program's quality and safety.
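The mechanics mirror the existing fine-tuning API. A minimal sketch, assuming a prepared JSONL training file (the filename is a placeholder); the GPT-4 variant is gated behind the experimental access program, so most developers will start from GPT-3.5:

```python
from openai import OpenAI

client = OpenAI()

# Upload chat-formatted training data
training = client.files.create(file=open("training_data.jsonl", "rb"), purpose="fine-tune")

# GPT-3.5 fine-tuning is generally available; a GPT-4 base model
# requires acceptance into the experimental access program
job = client.fine_tuning.jobs.create(training_file=training.id, model="gpt-3.5-turbo")
print(job.id, job.status)
```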

As for the Custom Models program, selected organizations will have the opportunity to work with a dedicated group of OpenAI researchers to train a custom GPT-4 model specific to their domain. This program allows organizations to modify every step of the model training process, from additional domain-specific pre-training to a custom reinforcement learning post-training process. With these deep customization options, organizations can leverage the full potential of AI to innovate and solve complex problems in their respective fields.

Copyright Protection Program

A new Copyright Shield program was introduced to safeguard customers against infringement claims. Under the program, OpenAI will step in to defend customers, and pay the costs incurred, if they face legal claims of copyright infringement. The program applies to generally available features of ChatGPT Enterprise and the developer platform.

Michael Fauscette

Michael is an experienced high-tech leader, board chairman, software industry analyst, and podcast host. He is a thought leader and published author on emerging trends in business software, artificial intelligence (AI), generative AI, digital-first and customer experience strategies, and technology. As a senior market researcher and leader, Michael has deep experience in business software market research, starting new tech businesses, and go-to-market models in large and small software companies.

Currently Michael is the Founder, CEO and Chief Analyst at Arion Research, a global cloud advisory firm; and an advisor to G2, Board Chairman at LocatorX and board member and fractional chief strategy officer for SpotLogic. Formerly the chief research officer at G2, he was responsible for helping software and services buyers use the crowdsourced insights, data, and community in the G2 marketplace. Prior to joining G2, Mr. Fauscette led IDC’s worldwide enterprise software application research group for almost ten years. He also held executive roles with seven software vendors including Autodesk, Inc. and PeopleSoft, Inc. and five technology startups.

Follow me:

@mfauscette.bsky.social

@mfauscette@techhub.social

www.twitter.com/mfauscette

www.linkedin.com/mfauscette

https://arionresearch.com