8080

Intelligence, everywhere.

Introduction

“Give me a place to stand, and I shall move the Earth.” – Archimedes

Technology is the lever with which humanity moves the world, and AI might be our biggest lever yet. It has the potential to change everything.

But there’s one huge challenge facing the next Bezos, Page, Andreessen, or Zuckerberg of the AI era: intelligence is too slow, too expensive, and too difficult to build with. These limitations are holding them – and by extension, the world – back.

We started 8080 to empower developers to create the next world-moving companies of the AI era. We are building a new AI inference cloud, designed exclusively for next-generation chips. Our mission is to make intelligence as fast, as cheap, and as easy to use as possible. The faster, cheaper, and easier we make AI, the more applications become possible.

“If I have seen further, it is by standing on the shoulders of giants.” – Sir Isaac Newton

We envision a world where intelligence is everywhere, incorporated into every object, making human life easier and better every second of every day. Intelligence that is too cheap to meter and too fast to notice.

The Googles and Amazons of the AI era haven’t been built yet. They will be built on 8080, and because of 8080. We will be the shoulders upon which they stand.


Working at 8080

We are not building a normal company. We’ve done that before, and believe that in the AI era there must be a better way. We are designing 8080 to accomplish our mission and are capping headcount at 10 until we reach $100M in revenue. That might not be possible, but we are going to try.

Open Roles

If you’re interested in joining us, send us an email at join[at]8080[dot]io.

We’re looking for partners who want to build for building’s sake. Every partner is a full-stack builder first, but each also brings, through experience or passion, expertise that augments the rest of the team. We are looking for partners with expertise in the following areas:

API & Platform

North Star: API performance. Building high-performance APIs and applications for developers, from web applications to account provisioning, management, and billing, to API design. Leveraging extensive experience with Python, Django, FastAPI, Postgres, AWS, and React, along with familiarity with systems languages like Rust, C, or C++, to design for builders and continually improve the user experience by working backwards from the customer.
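
To give a flavor of the register this work lives in, here is a minimal completion endpoint in FastAPI. The route, request schema, and generate() helper are hypothetical placeholders for illustration, not 8080’s actual API:

    # Minimal sketch of an inference endpoint in FastAPI. The route path,
    # schema, and generate() helper are hypothetical, not 8080's real API.
    from fastapi import FastAPI
    from pydantic import BaseModel

    app = FastAPI()

    class CompletionRequest(BaseModel):
        model: str
        prompt: str
        max_tokens: int = 256

    class CompletionResponse(BaseModel):
        text: str
        tokens_used: int

    async def generate(req: CompletionRequest) -> tuple[str, int]:
        # Placeholder for the call into the inference backend.
        return f"echo: {req.prompt}", len(req.prompt.split())

    @app.post("/v1/completions", response_model=CompletionResponse)
    async def completions(req: CompletionRequest) -> CompletionResponse:
        text, used = await generate(req)
        return CompletionResponse(text=text, tokens_used=used)

    # Run locally with: uvicorn main:app --reload  (if saved as main.py)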

Infrastructure & Systems

North Star: chip utilization and latency. Constructing state-of-the-art LLM inference infrastructure from scratch that handles millions of requests per second and maximizes hardware utilization. Leveraging expertise in high-performance, concurrent, and distributed systems; proficiency in system programming languages like Rust, C++, or Zig; and experience with Postgres, AWS, Redis, Kafka, Zipkin, or Jaeger to architect a robust, scalable backend that integrates seamlessly with novel hardware and API services.
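
One common technique for squeezing out utilization is dynamic batching: coalescing concurrent requests so the accelerator always runs full batches. A toy asyncio sketch of the idea, with hypothetical batch sizes and a stubbed-out model call, not how 8080’s scheduler actually works (requires Python 3.11+ for asyncio.timeout):

    # Toy sketch of dynamic batching for inference. Batch size and wait
    # time are hypothetical; run_batch() stands in for the real
    # accelerator call.
    import asyncio

    MAX_BATCH = 32      # hypothetical maximum batch size
    MAX_WAIT_MS = 5     # hypothetical wait before flushing a partial batch

    queue: asyncio.Queue = asyncio.Queue()

    async def run_batch(prompts):
        # Placeholder: one forward pass over the whole batch.
        return [f"out:{p}" for p in prompts]

    async def batcher():
        while True:
            batch = [await queue.get()]       # block until one request arrives
            try:
                async with asyncio.timeout(MAX_WAIT_MS / 1000):
                    while len(batch) < MAX_BATCH:
                        batch.append(await queue.get())
            except TimeoutError:
                pass                           # flush whatever we collected
            outputs = await run_batch([prompt for prompt, _ in batch])
            for (_, fut), out in zip(batch, outputs):
                fut.set_result(out)

    async def submit(prompt: str) -> str:
        fut = asyncio.get_running_loop().create_future()
        await queue.put((prompt, fut))
        return await fut

    async def main():
        asyncio.create_task(batcher())
        print(await asyncio.gather(*(submit(f"req-{i}") for i in range(4))))

    asyncio.run(main())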

AI

North Star: model output quality. Training and fine-tuning LLMs while obsessing over output quality. Experimenting with giving users new chain-of-thought and reasoning tools and harnessing capabilities that are only possible on our proprietary hardware. Utilizing expertise in AI frameworks, model optimization, and integration to advance AI capabilities and ensure exceptional results that seamlessly integrate with the platforms on which they run. This includes close collaboration with our hardware and systems teams to unlock unprecedented performance and accuracy.
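
As a toy illustration of what a user-facing reasoning control might look like, here is a prompt builder with a switch between direct answers and chain-of-thought. The option name and prompt wording are purely hypothetical:

    # Hypothetical sketch of a user-facing reasoning control: a request
    # option that toggles chain-of-thought. Names and wording are
    # illustrative only.
    def build_prompt(question: str, reasoning: str = "none") -> str:
        if reasoning == "chain_of_thought":
            return ("Think through the problem step by step, "
                    "then give your final answer.\n\n"
                    f"Q: {question}\nA:")
        return f"Q: {question}\nA:"

    print(build_prompt("What is 17 * 24?", reasoning="chain_of_thought"))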

Hardware Infrastructure

North Star: inference capacity, latency, and cost. Managing all hardware and data center deployments from the PCIe card level to the load balancer boundary. Working with proprietary LLM inference accelerators on each PCIe card, developing and maintaining drivers in Linux, and integrating telemetry and remote management for monitoring and control. Overseeing power distribution, cooling, and CPU resource allocation in each 4U chassis, while coordinating rack-level power, network architecture with redundant PDUs, and top-of-rack switches. Designing and implementing an aggregation switch fabric to interconnect five racks, linking to a core switch/router at the data center edge. Planning for dark fiber or AWS Direct Connect to connect multiple data centers. Ensuring robust orchestration and balancing of inference workloads, possibly using Kubernetes, SLURM, or a custom solution, and providing a clear path for high-bandwidth, low-latency connectivity across our entire server footprint. This role demands a holistic view of hardware and network architecture that enables everything else to run at peak performance.
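
To make the scale concrete, here is a back-of-envelope capacity and power budget for the topology described above. Every count and wattage is a hypothetical placeholder, not our actual deployment:

    # Back-of-envelope budget for the card -> chassis -> rack topology.
    # All numbers are hypothetical placeholders.
    CARDS_PER_CHASSIS = 8    # hypothetical accelerators per 4U chassis
    CHASSIS_PER_RACK = 8     # hypothetical 4U chassis per rack
    RACKS = 5                # five racks behind one aggregation fabric
    WATTS_PER_CARD = 300     # hypothetical per-card power draw

    cards = CARDS_PER_CHASSIS * CHASSIS_PER_RACK * RACKS
    kw = cards * WATTS_PER_CARD / 1000
    print(f"{cards} cards drawing roughly {kw:.0f} kW across {RACKS} racks")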

Finance

North Star: cost per token. Managing and streamlining the finances of a company that purchases and operates server hardware, including costs, capital expenditures, pricing strategy, procurement, and debt leverage. Building automated systems that scale to hundreds of millions in revenue with very few people, and leveraging analytical skill to offer customers the lowest possible prices while maintaining financial efficiency and controlling the end-to-end flow of capital.
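
As a toy model of the north-star metric: amortized hardware plus operating costs, divided by tokens served. All inputs here are hypothetical:

    # Toy cost-per-token model. Every input is a hypothetical placeholder.
    def cost_per_million_tokens(capex, amortization_months, monthly_opex,
                                tokens_per_second):
        monthly_capex = capex / amortization_months
        monthly_tokens = tokens_per_second * 60 * 60 * 24 * 30
        return (monthly_capex + monthly_opex) / monthly_tokens * 1_000_000

    # e.g. a $50k server amortized over 36 months, $500/month power and
    # space, serving 250k tokens/s (all hypothetical):
    print(f"${cost_per_million_tokens(50_000, 36, 500, 250_000):.4f} per 1M tokens")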

Developer Experience

North Star: time to value. Crafting and enhancing all aspects of developer tooling and experience—from CLIs, documentation, and libraries to demos and community engagement. Building automation to support millions of developers, leveraging a passion for improving the ease with which they can build, thereby fostering a vibrant developer community.
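
For flavor, a minimal sketch of the kind of CLI this role owns. The command, subcommands, and flags are hypothetical, and the API call is stubbed:

    # Minimal sketch of a developer CLI. Command, subcommands, and flags
    # are hypothetical; the API call is a stub.
    import argparse

    def main() -> None:
        parser = argparse.ArgumentParser(prog="8080",
                                         description="hypothetical 8080 CLI")
        sub = parser.add_subparsers(dest="cmd", required=True)
        complete = sub.add_parser("complete", help="run a completion")
        complete.add_argument("prompt")
        complete.add_argument("--model", default="base")
        args = parser.parse_args()
        if args.cmd == "complete":
            # Placeholder: a real CLI would call the 8080 API here.
            print(f"[{args.model}] echo: {args.prompt}")

    if __name__ == "__main__":
        main()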

Automation & Tooling

North Star: total headcount / total revenue. Designing and implementing automated workflows, processes, and internal applications to streamline operations and enable scaling a small team to hundreds of millions in revenue. Focusing on minimizing human effort by building internal tools, automating tasks, and ensuring all internal systems are efficient, cohesive, and scalable.


FAQ

What kind of performance can I expect?

Metric                       Estimate
Input tokens per second      250,000
Output tokens per second     25,000
Time to first token          20 ms
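
For intuition, here is what those estimates imply for a single request, assuming one request can draw the full output rate (the response length is a hypothetical input):

    # Rough request latency from the estimates above:
    # latency = time-to-first-token + output_tokens / output_tokens_per_second
    ttft_s = 0.020          # 20 ms time to first token
    output_tps = 25_000     # output tokens per second
    output_tokens = 500     # hypothetical response length

    latency_s = ttft_s + output_tokens / output_tps
    print(f"~{latency_s * 1000:.0f} ms for a {output_tokens}-token response")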

How much will it cost?

Our goal is to make intelligence too cheap to meter. Right now, we are targeting a price below $0.05 per million tokens, input or output, fine-tuned or not.
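
At that target price, a quick back-of-envelope (the request sizes are hypothetical):

    # Cost of one request at $0.05 per million tokens, input and output
    # priced identically. Token counts are hypothetical.
    price_per_token = 0.05 / 1_000_000
    prompt_tokens, output_tokens = 1_000, 500
    cost = (prompt_tokens + output_tokens) * price_per_token
    print(f"${cost:.6f} per request")   # $0.000075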

Why are you called 8080?

How can I learn more?

We’ll be adding more detail here as we get closer to launch. Until then, you can add your email here to stay in touch, and follow us on X.