Countries around the world are pursuing sovereign AI, using their own computing infrastructure, data, workforce, and business networks to generate artificial intelligence, ensuring AI systems are consistent with local values, laws, and interests.
To support these efforts, NVIDIA today announced the availability of four new NVIDIA NIM microservices that make it easier for developers to build and deploy high-performance generative AI applications.
The microservices support the popular community model, customized to local needs, enhancing user interactions through accurate understanding and improved responses based on local language and cultural heritage.
In the Asia-Pacific region alone, generative AI software revenue is expected to reach $48 billion by 2030, up from $5 billion this year, according to ABI Research.
Llama-3-Swallow-70B, trained on Japanese data, and Llama-3-Taiwan-70B, trained on Chinese data, are regional language models that can provide a deeper understanding of local laws, regulations, and other practices.
Built on Mistral-7B, the RakutenAI 7B family of models are trained on English and Japanese datasets and are available as two different NIM microservices: Chat and Instruct. Rakuten's Foundation and Instruct models achieved top scores among open Japanese large-scale language models and achieved the highest average scores in the LM Evaluation Harness benchmark conducted from January to March 2024.
Training large-scale language models (LLMs) in regional languages can better understand and reflect cultural and linguistic nuances, ensuring more accurate and nuanced communication and improving the effectiveness of the output.
Compared to baseline LLMs such as Llama 3, these models perform better in Japanese and Chinese language understanding, local legal tasks, question answering, and language translation and summarization.
Countries around the world are investing in their own AI infrastructure, from Singapore, the United Arab Emirates, South Korea and Sweden to France, Italy and India.
The new NIM microservices enable enterprises, government agencies and universities to host native LLM in their own environments, empowering developers to build advanced copilots, chatbots and AI assistants.
Developing Applications with Sovereign AI NIM Microservices
Developers can easily deploy sovereign AI models packaged as NIM microservices into production while achieving improved performance.
The microservices available in NVIDIA AI Enterprise are optimized for inference using the NVIDIA TensorRT-LLM open-source library.
Used as the base model for the new Llama–3-Swallow-70B and Llama-3-Taiwan-70B NIM microservices, the Llama 3 70B NIM microservices can achieve up to 5x higher throughput, reducing the overall cost of running the models in production and improving user experience with reduced latency.
The new NIM microservices are available today as a hosted application programming interface (API).
Leveraging NVIDIA NIM for Faster, More Accurate Generative AI Outcomes
NIM microservices accelerate deployment, improve overall performance, and provide the security required by organizations in industries around the world, including healthcare, finance, manufacturing, education, and legal.
Tokyo Institute of Technology used Japanese data to fine-tune Llama-3-Swallow 70B.
“LLM is not a mechanical tool that benefits everyone equally. Rather, it is an intellectual tool that interacts with human culture and creativity. The influence is reciprocal, not only is the model influenced by the data it uses to train, but our culture and the data we generate are also influenced by LLM,” said Rio Yokota, professor at the Global Academic Information and Computing Center of the Tokyo Institute of Technology. “Therefore, it is paramount that we develop our own AI models that comply with our cultural norms. With Llama-3-Swallow available as an NVIDIA NIM microservice, developers can easily access and deploy models for Japanese applications across industries.”
For example, Preferred Networks, a Japanese AI company, has used this model to develop a healthcare-specific model trained on a unique Japanese medical data corpus called Llama3-Preferred-MedSwallow-70B, which achieved the highest score on the Japanese National Medical Examination.
Chang Gung Memorial Hospital (CGMH), one of Taiwan's leading hospitals, has built a custom-made AI Inference Service (AIIS) to centrally manage all LLM applications within the hospital system, and is using Llama 3-Taiwan 70B to improve the efficiency of frontline medical staff by providing more nuanced medical terminology that patients can understand.
“AI applications built with local language LLMs can provide instant, contextual guidance to streamline workflows, support staff development and act as continuous learning tools to improve the quality of patient care,” said Dr. Changfu Kuo, director of the Center for Artificial Intelligence in Medicine at CGMH's Linko branch. “NVIDIA NIMs simplify the development of these applications, making models trained in local languages easily accessible and deployable with minimal engineering expertise.”
Pegatron, a Taiwan-based electronics manufacturer, will adopt Llama 3-Taiwan 70B NIM microservices for internal and external applications, which the company will integrate with the PEGAAi Agentic AI System to automate processes and improve manufacturing and operational efficiency.
Llama-3-Taiwan 70B NIM is also used by global petrochemical manufacturer Chang Chun Group, the world's leading printed circuit board manufacturer Unimicron, technology focused media company TechOrange, online contract services company LegalSign.ai and generative AI start-up APMIC, which are also collaborating under the open model.
Creating Custom Enterprise Models with NVIDIA AI Foundry
Regional AI models can provide culturally sensitive and localized responses, but companies need to fine-tune them to their business processes and domain expertise.
NVIDIA AI Foundry is a platform and services that includes popular foundational models, NVIDIA NeMo for fine-tuning, and dedicated capacity on NVIDIA DGX Cloud to provide developers with a full-stack solution for creating customized foundational models packaged as NIM microservices.
Additionally, developers using NVIDIA AI Foundry have access to the NVIDIA AI Enterprise software platform, which provides security, stability and support for production deployments.
NVIDIA AI Foundry provides developers with the tools they need to more quickly and easily build and deploy their own custom regional language NIM microservices that power AI applications, ensuring culturally and linguistically appropriate results for users.