The Costs of Leveraging the Power of GPT4

OpenAI reveals GPT-4, a model that processes images and text, shows human-level performance on professional benchmarks, scores in the top 10% on a simulated bar exam, and offers better steerability and safety
Lior Penso
Co-founder & COO at Qwak
March 19, 2023

OpenAI has introduced GPT-4, a large multimodal model that accepts image and text inputs and emits text outputs. The model demonstrates human-level performance on various professional and academic benchmarks; for example, it passes a simulated bar exam with a score in the top 10% of test takers. The company spent six months iteratively aligning GPT-4, which improved its results on factuality, steerability, and staying within guardrails.

To make GPT-4 accessible to the wider public, OpenAI is releasing its text input capability through ChatGPT and the API, while collaborating closely with a single partner to prepare for the release of the image input capability. Additionally, OpenAI is open-sourcing their framework for automated evaluation of AI model performance, OpenAI Evals, to encourage anyone to report shortcomings in their models and guide further improvements.

GPT-4 outperforms its predecessor, GPT-3.5, particularly on complex tasks that require greater creativity and nuance. OpenAI tested the two models on various benchmarks, including simulated exams designed for humans, and found that GPT-4 consistently performed better. Although the model saw a minority of the exam problems during training, OpenAI believes the results are representative and points readers to its technical report for details.

OpenAI’s GPT-4 Technical Report, 2023


To access the GPT-4 API, developers sign up for a waitlist and are invited as capacity scales up. Developers can make text-only requests to the model, and pricing is $0.03 per 1k prompt tokens and $0.06 per 1k completion tokens. Default rate limits are 40k tokens per minute and 200 requests per minute. The context length of GPT-4 is 8,192 tokens, but limited access is also provided to a 32,768-token context version, priced at $0.06 per 1k prompt tokens and $0.12 per 1k completion tokens. The model will be updated automatically over time, and requests for the 8K and 32K engines are processed at different rates based on capacity.
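To make these numbers concrete, here is a minimal Python sketch that turns the prices and default rate limits quoted above into a per-request cost and a rough throughput ceiling. The dictionary keys "8k" and "32k" and the function names are illustrative labels, not OpenAI API identifiers, and the throughput estimate assumes the token limit counts prompt and completion tokens together.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pricing:
    """USD per 1,000 tokens, as quoted in the article (March 2023)."""
    prompt_per_1k: float
    completion_per_1k: float


# Illustrative keys for the two context sizes; not official model names.
PRICING = {
    "8k": Pricing(prompt_per_1k=0.03, completion_per_1k=0.06),
    "32k": Pricing(prompt_per_1k=0.06, completion_per_1k=0.12),
}

# Default rate limits quoted in the article.
TOKENS_PER_MINUTE = 40_000
REQUESTS_PER_MINUTE = 200


def request_cost(context: str, prompt_tokens: int, completion_tokens: int) -> float:
    """Return the cost in USD of a single request."""
    price = PRICING[context]
    return (
        prompt_tokens / 1000 * price.prompt_per_1k
        + completion_tokens / 1000 * price.completion_per_1k
    )


def max_requests_per_minute(tokens_per_request: int) -> int:
    """Return how many requests fit in a minute under both default limits.

    Assumes every request uses the same total number of tokens
    (prompt plus completion).
    """
    if tokens_per_request <= 0:
        raise ValueError("tokens_per_request must be positive")
    return min(REQUESTS_PER_MINUTE, TOKENS_PER_MINUTE // tokens_per_request)


if __name__ == "__main__":
    # A request with 1,000 prompt tokens and 500 completion tokens.
    print(f"8K context:  ${request_cost('8k', 1000, 500):.2f}")
    print(f"32K context: ${request_cost('32k', 1000, 500):.2f}")
    print("Requests per minute at 1,500 tokens each:", max_requests_per_minute(1500))
```

With requests of this size, the token limit rather than the request limit is the binding constraint, so heavier prompts reduce throughput as well as raising cost.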

OpenAI’s GPT-4 Pricing Page, 2023

Should you buy the API or build your own model?

Deciding whether to use GPT-4 or build a custom model depends on the specific needs and resources of the company. GPT-4 can be useful for companies that require a large, pre-trained language model to generate human-like text and complete various tasks with high accuracy and efficiency. This can be beneficial for applications such as chatbots, content generation, and language translation.

However, if a company has specialized needs or unique data sets, building a custom model may be more appropriate. This can allow for greater control over the specific features and functions of the model, and may result in more accurate and tailored results. Companies with significant resources and expertise in machine learning may find it more cost-effective and efficient to build their own models rather than rely on pre-existing models such as GPT-4.

Another factor to consider is cost. The cost of using GPT-4 depends on several factors, including the level of access required and the volume of usage. If a company uses the GPT-4 API to generate a large volume of prompts and completions, the cost can add up quickly. In general, the cost of using generative AI platforms is complex and involves multiple parties and parameters. Beyond the API usage fees, there may be additional costs for the data storage, infrastructure, and technical expertise required to build and deploy AI models that use the GPT-4 API. These additional costs can vary widely depending on the company's specific needs and requirements.
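One way to frame the buy-versus-build question is as a break-even calculation. The sketch below is an assumption-laden simplification, not a costing method from OpenAI or from this article: it treats building your own model as a fixed monthly cost plus a per-request serving cost, and treats the API as a pure per-request cost. The function name and all parameters are hypothetical, and real costs such as engineering time and retraining are rarely this tidy.

```python
import math
from typing import Optional


def break_even_requests(
    fixed_monthly_cost: float,
    api_cost_per_request: float,
    own_cost_per_request: float = 0.0,
) -> Optional[int]:
    """Return the smallest monthly request volume at which running your own
    model costs no more than the API, or None if that never happens.

    fixed_monthly_cost   -- hypothetical monthly cost of infrastructure and staff
    api_cost_per_request -- average API cost of one request, in the same currency
    own_cost_per_request -- hypothetical marginal cost of serving one request yourself
    """
    saving_per_request = api_cost_per_request - own_cost_per_request
    if saving_per_request <= 0:
        # The API is at least as cheap per request, so the fixed cost
        # of building is never recovered.
        return None
    return math.ceil(fixed_monthly_cost / saving_per_request)


if __name__ == "__main__":
    # Placeholder inputs for illustration only; substitute your own estimates.
    volume = break_even_requests(
        fixed_monthly_cost=1000.0,
        api_cost_per_request=0.5,
        own_cost_per_request=0.25,
    )
    print("Break-even monthly requests:", volume)
```

Below the break-even volume the API is the cheaper option under these assumptions; above it, the fixed cost of building is spread over enough requests to pay off.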

On the other hand, building your own model can also be complex. It requires significant resources, including time, money, and expertise in data science, engineering and machine learning operations. Building your own model may be the best option if you require a highly specialized model or if you have access to large amounts of data that can be used to train the model.

Building Your Own? Go From Research to Production With MLOps

If a company decides to build its own machine learning model instead of using a generative model's API, it is crucial to invest in MLOps. MLOps provides the infrastructure, tools, and processes needed to keep the model efficient, reliable, and scalable. Without proper MLOps practices, developing, deploying, and maintaining machine learning models can become chaotic and error-prone, leading to inefficient use of resources, inaccurate predictions, and ultimately wasted time and money.

MLOps ensures that the model is trained and deployed in a controlled environment and that the process is repeatable, scalable, and automated. It rests on several essential pillars: data management, model training and testing, deployment, and monitoring. Data management involves collecting, cleaning, and storing data in a way that ensures accuracy, completeness, and consistency. Model training and testing ensure that the model is developed in a controlled environment, tested thoroughly, and validated against real-world data. Deployment releases the model to production in a controlled and automated manner, and monitoring tracks the model's performance continuously so that issues are resolved promptly. By investing in MLOps, companies can ensure that their machine learning models are developed and deployed efficiently, accurately, and at scale.

Hidden Technical Debt in Machine Learning Systems, D. Sculley et al., 2015

Chat with us to see the platform live and discover how we can help simplify your journey deploying AI in production.

Say goodbye to complex MLOps with Qwak