Meta has just released Code Llama, a large language model (LLM) that can generate code from text prompts. According to the American company, Code Llama has the potential to make workflows faster and more efficient for experienced developers while lowering the barrier to entry for those who are learning to program.
In fact, the new tool appears to have been designed as a productivity and educational tool to help developers write more robust and well-documented software.
The generative AI space is evolving rapidly, and the Menlo Park-based company believes that an open approach to today’s AI is the best for developing innovative, secure, and responsible AI tools. This is why Code Llama has been released with the same community license as Llama 2.
But how does the new tool work, and what capabilities does it really have? Let’s try to find out.
How Code Llama Works
Code Llama is a code-specialized version of Llama 2, created by further training Llama 2 on code-specific datasets and sampling more data from those datasets for longer. In essence, Code Llama builds improved coding capabilities on top of Llama 2.
As a result, it can generate code and natural language about code from both code prompts and natural language (e.g., “Write me a function that returns the Fibonacci sequence”). It can also be used to complete and debug code errors. Furthermore, it supports many of the most popular languages used today, including Python, C++, Java, PHP, TypeScript (JavaScript), C#, and Bash.
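To make the example prompt concrete, here is the kind of completion such a request might yield. This is an illustrative sketch of plausible model output, not text actually generated by Code Llama:

```python
def fibonacci(n):
    """Return the first n numbers of the Fibonacci sequence."""
    sequence = []
    a, b = 0, 1
    for _ in range(n):
        sequence.append(a)
        a, b = b, a + b
    return sequence

print(fibonacci(8))  # [0, 1, 1, 2, 3, 5, 8, 13]
```

A follow-up natural-language prompt could then ask the model to explain, document, or debug this same function.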
It’s interesting to note that Meta is actually releasing Code Llama in three sizes, with 7 billion, 13 billion, and 34 billion parameters, respectively. Each of these models has been trained on 500 billion tokens of code and code-related data. The 7 billion and 13 billion base and instruct models also have a fill-in-the-middle (FIM) capability, allowing them to insert code into existing code, meaning they can support tasks like code completion right out of the box.
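The FIM capability works by giving the model the code before and after the gap and letting it generate what belongs in between. A minimal sketch of how such an infilling prompt can be assembled, using the `<PRE>`/`<SUF>`/`<MID>` sentinel tokens from Meta's reference implementation (the exact token names and spacing are an assumption here and should be checked against the release you use):

```python
def build_infill_prompt(prefix: str, suffix: str) -> str:
    # Fill-in-the-middle prompt: the model is shown the code before
    # (<PRE>) and after (<SUF>) the gap, then generates the missing
    # middle section following the <MID> sentinel.
    return f"<PRE> {prefix} <SUF>{suffix} <MID>"

prefix = "def remove_non_ascii(s: str) -> str:\n    "
suffix = "\n    return result"
prompt = build_infill_prompt(prefix, suffix)
```

The completion the model returns after `<MID>` is then spliced between the original prefix and suffix, which is how editor-style code completion can be built on top of the 7B and 13B models.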
In addition to being a prerequisite for generating longer programs, having longer input sequences unlocks new and interesting use cases for a code LLM. For example, users can provide the model with more context from their codebase to make its generations more relevant. Longer inputs also help in debugging scenarios in larger codebases, where tracking down all the code related to a specific problem can be a challenge. When faced with debugging a large block of code, developers can pass the entire block to the model.
In addition to the base model, Meta has released a Python-specific version called Code Llama-Python and another version called Code Llama-Instruct, which can understand instructions in natural language. According to Meta, the specialized versions of Code Llama are not interchangeable, and the company does not recommend using the base Code Llama or Code Llama-Python models for natural language instructions.
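The practical consequence is that prompts for Code Llama-Instruct should be wrapped in the Llama 2-style instruction format rather than sent as raw text. A minimal sketch of that wrapper, assuming the `[INST]`/`[/INST]` and `<<SYS>>` conventions carried over from Llama 2 (verify against the model card for the exact template):

```python
def build_instruct_prompt(user_message: str, system_prompt: str = "") -> str:
    # Llama 2-style instruction wrapper expected by Code Llama-Instruct.
    # The base and Python variants are not tuned for this format.
    if system_prompt:
        user_message = f"<<SYS>>\n{system_prompt}\n<</SYS>>\n\n{user_message}"
    return f"[INST] {user_message} [/INST]"

prompt = build_instruct_prompt(
    "Write a Bash one-liner that counts the lines in every .py file.",
    system_prompt="Answer with code only.",
)
```

Sending the same request without the wrapper to the base model would instead be treated as text to complete, which is why Meta discourages mixing the variants.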
“Programmers are already using LLMs to assist with a variety of tasks, from writing new software to resolving issues in existing code,” Meta said in a blog post. “The goal is to make developers’ workflows more efficient so they can focus on the more human-centered aspects of their work.”
Meta claims that Code Llama has outperformed other publicly available LLMs on benchmark tests. The Phind blog also claimed to have further fine-tuned CodeLlama-34B and CodeLlama-34B-Python on an internal dataset, achieving 67.6% and 69.5% pass@1 on HumanEval, respectively. By comparison, the direct competitor, GPT-4, reached 67% according to OpenAI's official technical report from March. Additionally, Phind states that, to ensure the validity of the results, it applied OpenAI's decontamination methodology to its dataset.
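For readers unfamiliar with the pass@1 numbers above: pass@k measures the probability that at least one of k sampled completions passes a problem's unit tests, estimated with the unbiased formula from the HumanEval paper. A short illustration:

```python
from math import comb

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimator from the HumanEval paper:
    n = completions sampled per problem, c = completions that pass.
    pass@k = 1 - C(n - c, k) / C(n, k), averaged over all problems."""
    if n - c < k:
        return 1.0  # too few failures to fill a sample of size k
    return 1.0 - comb(n - c, k) / comb(n, k)

# With a single sample per problem, pass@1 is simply the pass rate:
print(pass_at_k(1, 1, 1))  # 1.0
print(pass_at_k(1, 0, 1))  # 0.0
# With 10 samples of which 3 pass, expected pass@1 is 0.3:
print(round(pass_at_k(10, 3, 1), 2))  # 0.3
```

So a 69.5% pass@1 means that, on average, a single sampled completion solves roughly seven out of ten HumanEval problems.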
Therefore, Code Llama could be a valuable ally for both experienced developers and those who are just starting out.