Available for collab — multimodal at Cohere

founder·scientist·multimodal @ Cohere

Built Equall — legal AI, VC-backed, SaulLM. Built the CentraleSupélec NLP lab from zero. Now training frontier multimodal at Cohere.

Research → 🏆 AAAI 22 Outstanding Student Paper · NeurIPS · ICML · ICLR · JMLR · TMLR · TACL · ACL · EMNLP · COLM
Pierre Colombo
zero → company
Co-founded Equall. First legal-AI company on a domain LLM. Raised VC. Trained SaulLM (7B → 141B). Shipped to Big Law.
zero → lab
Founded the CentraleSupélec NLP lab. Advised 6 PhDs, who went on to Mistral, Meta FAIR, Reflection AI, Sword Health. $6.1M in compute.
paper → production
ColPali in production at NVIDIA, Jina & Amazon. State of AI Report 2024. Open European models — frontier of their categories.
01 What I built
Jan 2026 → now

Member of Technical Staff · Multimodal

Cohere
  • Training frontier vision-language models at production scale
  • End-to-end: data · training · evaluation
Jul 2023 → Dec 2025

Co-Founder & Chief Science Officer

Equall · Paris / NYC
  • First legal-AI company on a domain LLM
  • Raised VC. Built science + engineering from zero
  • Trained SaulLM (7B → 141B). 512 GPUs · 500B+ legal tokens
  • Shipped M&A due-diligence to Big Law. Led the pivot
Sep 2022 → Dec 2025

Associate Professor · Founder of the NLP Group

CentraleSupélec · Université Paris-Saclay
  • Founded the NLP lab. Built it from zero
  • Raised $6.1M in compute (Jean Zay · MareNostrum · Adastra) + $1M in industrial funding
  • Advised 6 PhDs → Mistral · Meta FAIR · Reflection AI · Sword Health
  • Co-organized ACL 2024. 700+ teaching hours across CentraleSupélec, Télécom Paris, EPFL, IP Paris
02 What shipped
Multimodal data
Document · video · audio

Models for documents, video and audio. ColPali — visual document retrieval used in production at NVIDIA, Jina, Amazon. Recognized as a multimodal breakthrough in the State of AI Report 2024. BidirLM pushes it to omnimodal encoders (text · image · audio).

Legal AI
Domain LLM · product · agents

SaulLM (7B → 141B) — the first LLM line built for legal. Powered Equall's M&A due-diligence product shipped to Big Law. Now extended to autonomous legal agents.

Open LLMs from Europe
Open multilingual models

Open multilingual models from Europe — from a bilingual French-English LLM to 22B-scale generation. Plus a contribution to BLOOM (176B), the open-science landmark.

Evaluation
Methods · awarded

Metrics and benchmarking for generative AI. InfoLM — 🏆 AAAI 2022 Outstanding Student Paper, 1 of 9,020 submissions.

→ full list on Google Scholar (60+ papers)

6.8K Citations
35 h-index
60+ Papers
🏆 AAAI 2022 Outstanding Student Paper
The numbers are real — but they're the byproduct, not the point. Full record →
03 People I trained

PhDs advised at the CentraleSupélec NLP lab — and where they landed.

→ Graduated
Maxime Darrin · 2021–2025 · w/ P. Piantanida · Research Scientist, Mistral
Nuno M. Guerreiro · defended 2025 · w/ A. Martins · Principal Research Scientist, Sword Health
Duarte M. Alves · 2023–2025 · w/ A. Martins · Reflection AI
● Ongoing
Manuel Faysse · since 2022 · w/ C. Hudelot · in progress
Nicolas Boizard · since 2023 · w/ C. Hudelot · in progress
Hippolyte Gisserot-Boukhlef · since 2023 · w/ C. Hudelot · in progress