For years, progress in artificial intelligence (AI) has been tied to ever-increasing scale: more data, more parameters, and greater computing power. As a result, large language models (LLMs) can often only run in expensive data centers, heavily dependent on cloud computing and dedicated GPUs.
A US startup, however, is challenging this approach by packing large-scale AI into a pocket-sized device.
Tiiny AI Inc. recently introduced the Tiiny AI Pocket Lab, recognized by Guinness World Records as the world's smallest personal AI supercomputer in the "smallest mini computer running a 100B LLM locally" category.
According to the company, it is the first pocket device that can run LLMs with up to 120 billion parameters entirely on-device, with no cloud, server, or high-end GPU connection required.
Ambition to bring powerful AI closer to individual users
In its vision statement, Tiiny AI emphasizes the goal of moving advanced AI out of giant data centers and into the hands of individual users.
The company argues that the biggest bottleneck in the current AI ecosystem is not a lack of computing power but dependence on the cloud, which brings high costs, high latency, and privacy risks.
The Tiiny AI Pocket Lab measures about 14.2 x 8 x 2.53 cm and weighs roughly 300 grams, yet it is designed as a complete AI inference system.
The device draws about 65 W, far less than traditional GPU-based AI systems, which consume enormous amounts of energy.
Hardware configuration and notable performance
According to the announcement, the Pocket Lab is equipped with a 12-core ARMv9.2 CPU paired with a dedicated neural processing unit (NPU), delivering about 190 TOPS of AI compute. The device comes with 80GB of LPDDR5X memory and 1TB of storage, allowing large models to run entirely on the machine.
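The announcement does not say how models are quantized, but simple arithmetic shows why the 80GB figure matters: model weights stored at q bits per parameter need roughly params × q / 8 bytes, plus headroom for the KV cache and runtime. A minimal back-of-the-envelope sketch (the function and the 1 GB = 10^9 bytes convention are illustrative choices, not figures from Tiiny AI):

```python
# Rough estimate of on-device weight footprint at various quantization levels.
def weight_footprint_gb(params_billion: float, bits_per_param: float) -> float:
    """Approximate weight size in GB, using 1 GB = 1e9 bytes."""
    return params_billion * 1e9 * bits_per_param / 8 / 1e9

for params in (10, 70, 120):
    for bits in (16, 8, 4):
        gb = weight_footprint_gb(params, bits)
        fits = "fits" if gb < 80 else "does not fit"
        print(f"{params:>4}B params @ {bits:>2}-bit: ~{gb:>6.1f} GB -> {fits} in 80 GB")
```

By this estimate, a 120-billion-parameter model fits in 80GB only at around 4-bit precision (~60GB of weights), which is consistent with how large models are typically run on memory-constrained hardware.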
Tiiny AI says the Pocket Lab works best in the "golden zone" of personal AI, models of roughly 10 to 100 billion parameters, a range the company claims covers more than 80% of real-world needs.
The company also claims the device's inference quality can reach levels comparable to GPT-4o, enough for multi-step analysis, deep contextual understanding, and complex reasoning tasks.
Core technology behind the device
The ability to run large models on a compact device comes from two main technologies: TurboSparse and PowerInfer.
TurboSparse applies a neuron-level sparse-activation technique, which significantly reduces the amount of computation needed during inference.
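The core idea is that in many feed-forward layers most neurons contribute nothing for a given token, so a cheap predictor can select the likely-active neurons and the layer computes only those rows. A toy NumPy sketch of the general idea (not Tiiny AI's implementation; the "predictor" here simply picks the true top 10% of pre-activations to illustrate the compute savings):

```python
import numpy as np

rng = np.random.default_rng(0)
d_model, d_ffn = 64, 256
W_up = rng.normal(size=(d_ffn, d_model))
W_down = rng.normal(size=(d_model, d_ffn))

def ffn_sparse(x: np.ndarray, active: np.ndarray) -> np.ndarray:
    """FFN forward pass restricted to the predicted-active neuron indices."""
    h = np.maximum(W_up[active] @ x, 0.0)   # compute only the selected neurons
    return W_down[:, active] @ h            # combine only their outputs

x = rng.normal(size=d_model)
# Stand-in for a learned activation predictor: take the top 10% of neurons
# by pre-activation score, skipping the other 90% of the layer's work.
scores = W_up @ x
active = np.argsort(scores)[-d_ffn // 10:]
y = ffn_sparse(x, active)
print(f"computed {len(active)}/{d_ffn} neurons ({len(active)/d_ffn:.0%} of the layer)")
```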
PowerInfer, meanwhile, is an open-source inference engine that flexibly distributes the workload between the CPU and the NPU, optimizing performance without a separate GPU.
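PowerInfer's published design exploits the fact that a small set of "hot" neurons activates very frequently, so their weights can live on the accelerator while the long tail of "cold" neurons stays on the CPU. A conceptual toy of that partitioning (the hotness distribution and the 80% cutoff are invented for illustration; the real engine profiles actual activation statistics):

```python
import numpy as np

rng = np.random.default_rng(1)
n_neurons = 1024
hotness = rng.power(0.3, n_neurons)  # skewed: a few neurons fire very often

# Pin the frequently activated ("hot") neurons to the fast accelerator and
# leave the rarely activated ("cold") majority on the CPU.
hot_mask = hotness > np.quantile(hotness, 0.8)
hot_ids, cold_ids = np.where(hot_mask)[0], np.where(~hot_mask)[0]

def dispatch(active: np.ndarray) -> dict:
    """Route each active neuron to the backend that holds its weights."""
    return {
        "npu": np.intersect1d(active, hot_ids),
        "cpu": np.intersect1d(active, cold_ids),
    }

active = rng.choice(n_neurons, size=128, replace=False)
plan = dispatch(active)
print(f"NPU handles {len(plan['npu'])} neurons, CPU handles {len(plan['cpu'])}")
```

Because the hot set is small but covers most activations, the accelerator does the bulk of the work even though it holds only a fraction of the weights.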
Thanks to this combination, tasks that previously required thousands of dollars' worth of GPU hardware can now run on a pocket device.
Open ecosystem and the road to CES 2026
The Tiiny AI Pocket Lab supports one-click installation of open-source models, including GPT-OSS, Qwen, DeepSeek, Llama, Phi, Mistral, and more. The device is also compatible with a range of open-source AI agents and tools such as OpenManus, ComfyUI, Flowise, and SillyTavern.
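The announcement does not document a programming interface, but local inference stacks of this kind commonly expose an OpenAI-compatible HTTP endpoint. Purely as an illustration of what on-device use might look like, here is a hypothetical request (the hostname, port, and model name below are assumptions, not confirmed details):

```python
import requests

resp = requests.post(
    "http://pocket-lab.local:8080/v1/chat/completions",  # assumed local endpoint
    json={
        "model": "qwen2.5-72b-instruct",  # illustrative pick from the supported model families
        "messages": [
            {"role": "user", "content": "Summarize this contract clause in plain English: ..."}
        ],
    },
    timeout=120,
)
resp.raise_for_status()
print(resp.json()["choices"][0]["message"]["content"])
```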
The company says users will receive continuous updates, including firmware upgrades delivered over the air (OTA), and plans a full demonstration at CES in January 2026.