Efficient and Autonomous AI Inference (TURING Project)

Scalable, trustworthy, and deployable AI inference for real-world systems

Overview

Modern AI models are powerful but computationally heavy.
Deploying them in real-world systems, especially in industrial and edge environments, requires careful optimization of latency, memory, and energy consumption.

Under the Horizon Europe TURING project, this research focuses on:

  • Efficient model inference
  • Autonomous AI execution
  • Scalable and trustworthy deployment pipelines
  • Real-world industrial constraints

The work is conducted at the Institute of Communication and Computer Systems (ICCS) of the National Technical University of Athens (NTUA).


Problem

Large AI models face several deployment challenges:

  • High latency during inference
  • Large memory footprint
  • Energy inefficiency at the edge
  • Limited hardware resources
  • Trust and reliability concerns

We ask:

How can AI models adapt dynamically to resource constraints while maintaining performance and reliability?


Research Directions

This project explores:

  • Efficient inference architectures
  • Model compression and optimization
  • Dynamic computation (e.g., conditional execution, early exits)
  • Scalable deployment strategies
  • Edge and industrial AI systems
  • Reproducible AI pipelines for real-world environments

The broader goal is to move from static AI models to adaptive and autonomous AI systems.
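
As an illustration of the dynamic-computation direction listed above, the following is a minimal sketch of confidence-based early exiting. It is not the project's implementation: the names early_exit_predict and toy_layers, the threshold value, and the toy blocks and exit heads are all assumptions made for this example. The idea is that an input passes through the backbone block by block, and inference stops at the first exit head whose top-class probability reaches the threshold.

```python
import math
from typing import Callable, List, Sequence, Tuple

Vector = List[float]
Block = Callable[[Vector], Vector]  # hypothetical backbone block: hidden -> hidden
Head = Callable[[Vector], Vector]   # hypothetical exit head: hidden -> class logits


def softmax(logits: Sequence[float]) -> Vector:
    """Numerically stable softmax over a list of logits."""
    peak = max(logits)
    exps = [math.exp(value - peak) for value in logits]
    total = sum(exps)
    return [e / total for e in exps]


def early_exit_predict(
    x: Sequence[float],
    layers: Sequence[Tuple[Block, Head]],
    threshold: float = 0.9,
) -> Tuple[int, float, int]:
    """Return (label, confidence, blocks_used).

    Runs the blocks in order and stops at the first exit head whose
    top-class probability is at least `threshold`. If no head is
    confident enough, the prediction of the last head is returned.
    """
    if not layers:
        raise ValueError("at least one (block, head) pair is required")
    hidden = list(x)
    for depth, (block, head) in enumerate(layers, start=1):
        hidden = block(hidden)
        probs = softmax(head(hidden))
        confidence = max(probs)
        if confidence >= threshold:
            break  # confident enough: skip the remaining blocks
    return probs.index(confidence), confidence, depth


# Toy stand-ins (assumptions for illustration only): every block doubles
# the hidden vector and every exit head reads the hidden vector as logits.
toy_layers: List[Tuple[Block, Head]] = [
    (lambda h: [2.0 * v for v in h], lambda h: list(h)) for _ in range(3)
]

label, confidence, used = early_exit_predict([1.0, 0.0], toy_layers)
print(f"label={label} confidence={confidence:.3f} blocks_used={used}")
```

In this toy setup, an input with a larger margin between its components exits after fewer blocks, so clear-cut inputs cost less compute than ambiguous ones. The threshold acts as the knob that trades latency against reliability.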


Contributions (Ongoing)

  • Studying scalable inference strategies for large models
  • Exploring compute-aware AI design
  • Investigating deployment-aware model optimization
  • Aligning AI systems with real-world latency and resource constraints
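
One way to picture what aligning a system with latency and resource constraints can mean in practice is a selection step that picks, among several pre-built variants of a model, the most accurate one that fits the current budget. The sketch below is hypothetical: the Variant class, the select_variant function, the variant names, and every number in VARIANTS are illustrative placeholders, not project measurements.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Variant:
    """A deployable model variant with profiled costs (hypothetical fields)."""
    name: str
    latency_ms: float
    memory_mb: float
    accuracy: float


def select_variant(
    variants: Iterable[Variant],
    latency_budget_ms: float,
    memory_budget_mb: float,
) -> Optional[Variant]:
    """Return the most accurate variant that fits both budgets, or None."""
    feasible = [
        v for v in variants
        if v.latency_ms <= latency_budget_ms and v.memory_mb <= memory_budget_mb
    ]
    if not feasible:
        return None  # nothing fits: the caller must degrade or reject the request
    return max(feasible, key=lambda v: v.accuracy)


# Placeholder profile table; all values are made up for illustration.
VARIANTS = [
    Variant("full", latency_ms=120.0, memory_mb=900.0, accuracy=0.92),
    Variant("pruned", latency_ms=60.0, memory_mb=450.0, accuracy=0.90),
    Variant("quantized", latency_ms=25.0, memory_mb=120.0, accuracy=0.87),
]

choice = select_variant(VARIANTS, latency_budget_ms=70.0, memory_budget_mb=500.0)
print(choice.name if choice else "no feasible variant")
```

Returning None rather than silently falling back to the smallest variant keeps the failure visible to the caller, which matters when reliability is part of the requirement.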

Impact

  • Enables deployable AI for industrial systems
  • Reduces inference latency and computational cost
  • Supports trustworthy and scalable AI deployment
  • Advances edge intelligence for connected environments

Vision

The long-term objective is to design AI systems that:

  • Understand their execution environment
  • Adapt computation dynamically
  • Maintain reliability under constraints
  • Bridge the gap between theory and deployable systems engineering