Paper2Video – An academic paper-to-presentation video generation project developed by the National University of Singapore

AI Tools updated 5d ago dongdong

What is Paper2Video?

Paper2Video is a project from the Show Lab at the National University of Singapore (NUS) that automatically generates presentation videos from academic papers. Its PaperTalker multi-agent framework converts a research paper into a complete presentation video with slides, subtitles, speech, and a speaker avatar. The framework has four modules (Slide Builder, Subtitle Builder, Cursor Builder, and Speaker Builder), which handle slide generation, subtitle creation, cursor positioning, and speaker video synthesis, respectively.
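To make the division of labor concrete, here is a minimal Python sketch of four stages that run in sequence, each adding its own artifact to a shared record. It is not the project's actual code: the `Presentation` class, the function names, and the placeholder outputs are all hypothetical and only mirror the module names mentioned above.

```python
from dataclasses import dataclass, field


@dataclass
class Presentation:
    """Collects what each stage produces (all fields are illustrative)."""
    paper_title: str
    slides: list[str] = field(default_factory=list)
    subtitles: list[str] = field(default_factory=list)
    cursor_points: list[tuple[float, float]] = field(default_factory=list)
    speaker_clips: list[str] = field(default_factory=list)


def slide_builder(p: Presentation, sections: list[str]) -> None:
    # Stand-in: one slide per paper section.
    p.slides = [f"Slide: {s}" for s in sections]


def subtitle_builder(p: Presentation) -> None:
    # Stand-in: one narration line per slide.
    p.subtitles = [f"Narration for '{s}'" for s in p.slides]


def cursor_builder(p: Presentation) -> None:
    # Stand-in: point at the slide centre, in normalised (x, y) coordinates.
    p.cursor_points = [(0.5, 0.5) for _ in p.slides]


def speaker_builder(p: Presentation) -> None:
    # Stand-in: name one talking-head clip per narration line.
    p.speaker_clips = [f"clip_{i}.mp4" for i, _ in enumerate(p.subtitles)]


def build_presentation(title: str, sections: list[str]) -> Presentation:
    """Run the four hypothetical stages in order on one paper."""
    p = Presentation(title)
    slide_builder(p, sections)
    subtitle_builder(p)
    cursor_builder(p)
    speaker_builder(p)
    return p
```

Calling `build_presentation("Demo", ["Intro", "Method"])` yields a record with two slides and matching subtitles, cursor points, and clip names. In the real system, each stage would call language, vision, or speech models instead of returning placeholders.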

Paper2Video also provides a benchmark for academic presentation videos, presented as the first high-quality one of its kind. It contains 101 research papers paired with the authors’ own presentation videos, slides, and other materials. The benchmark introduces four evaluation metrics (Meta Similarity, PresentArena, PresentQuiz, and IP Memory) that measure how effectively a video conveys the paper’s core ideas, how understandable it is, how well it highlights the authors’ contributions, and how much it enhances the paper’s overall impact.



Key Features of Paper2Video

  • Automated Video Generation:
    Automatically generates presentation videos from academic papers, transforming complex scholarly content into visually and audibly engaging formats.

  • Multi-Agent Framework:
    Powered by the PaperTalker framework, integrating modules for slide generation, subtitle creation, cursor movement planning, speech synthesis, and speaker avatar rendering to produce high-quality videos efficiently.

  • High-Quality Benchmark Dataset:
    Offers a benchmark dataset containing 101 papers with their author presentation videos and slides, establishing a standard for research and evaluation of academic presentation videos.

  • Custom Evaluation Metrics:
    Introduces four evaluation metrics — Meta Similarity, PresentArena, PresentQuiz, and IP Memory — to assess presentation quality and effectiveness from multiple dimensions.

  • User-Friendly Tools:
    Provides complete source code and detailed documentation, allowing researchers and developers to easily generate their own presentation videos.


Technical Principles of Paper2Video

  • Slide Generation and Optimization:
    Extracts content from LaTeX source files of papers to produce draft slides in Beamer format. It uses a “tree search visual selection” approach to optimize layout — generating multiple layout candidates and letting a Vision-Language Model (VLM) select the best design.

  • Subtitle and Cursor Generation:
    Creates corresponding narration scripts (subtitles) for each slide and plans cursor trajectories that simulate the speaker’s pointing gestures during explanations. Cursor movement and speech are temporally and spatially aligned to guide viewer attention.

  • Speaker Generation:
    From a portrait photo and a short audio sample of the author, Paper2Video uses text-to-speech (TTS) and talking-head generation to synthesize a lifelike virtual speaker whose facial and lip movements match the speech.

  • Parallelized Processing:
    Splits video generation by slides and processes them in parallel, significantly reducing total generation time.
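
Two of the ideas above, choosing a layout from several candidates and processing slides in parallel, can be illustrated with a short Python sketch. This is a simplified stand-in, not the project's implementation. It uses a flat candidate search rather than a tree search, and a numeric scoring function takes the place of the VLM judge. The constants, function names, and the "content size" model are all hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor

FRAME_CAPACITY = 100.0                   # hypothetical: content a frame can hold
CANDIDATE_SCALES = (1.0, 0.9, 0.8, 0.7)  # hypothetical font-scale candidates


def score_layout(content_size: float, scale: float) -> float:
    """Stand-in for the VLM judge: penalise overflow, then prefer larger text."""
    used = content_size * scale
    if used > FRAME_CAPACITY:
        return -(used - FRAME_CAPACITY)  # overflowing layouts score below zero
    return scale                         # among layouts that fit, bigger text wins


def choose_scale(content_size: float) -> float:
    """Score every candidate scale for one slide and keep the best one."""
    return max(CANDIDATE_SCALES, key=lambda s: score_layout(content_size, s))


def choose_scales_in_parallel(content_sizes: list[float]) -> list[float]:
    """Slides are independent, so each can be handled by a separate worker."""
    with ThreadPoolExecutor(max_workers=4) as pool:
        # pool.map returns results in the same order as the input slides.
        return list(pool.map(choose_scale, content_sizes))
```

For example, `choose_scales_in_parallel([80.0, 110.0, 120.0, 200.0])` picks a smaller scale for each denser slide. The last slide overflows at every candidate, so it falls back to the candidate with the least overflow. Because no slide depends on another, the same per-slide function could be handed to processes or separate machines when the per-slide work is heavy.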


Project Links


Use Cases of Paper2Video

  • Academic Conferences:
    Helps researchers quickly generate high-quality presentation videos, saving preparation time and enhancing presentation delivery.

  • Online Education:
    Enables educators to transform academic papers into engaging video lectures, improving interactivity and learner engagement.

  • Social Media Outreach:
    Allows research findings to be shared in an accessible, video-based format across social platforms, broadening academic impact.

  • Academic Reporting:
    Facilitates rapid production of research report videos for internal presentations or public lectures.

  • Research Promotion:
    Offers institutions and scholars an innovative way to showcase their research, boosting visibility and public awareness.
