The repository provides code for running inference with the Segment Anything Model (SAM), links for downloading the trained model checkpoints, and example notebooks that show how to use the model.
#Vector search engine# A vector similarity search library that provides efficient similarity search and clustering for dense vectors
Detectron2 is a platform for object detection, segmentation and other visual recognition tasks.
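Fairseq is a Seq2seq modeling toolkit written in Python for training custom models for translation, summarization, language modeling, and other text generation tasks
Open-Sora: A fully open-source, efficient reproduction of a Sora-like video generation solution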
AniPortrait: Audio-Driven Synthesis of Photorealistic Portrait Animation
GeneFace++: Generalized and Stable Real-Time 3D Talking Face Generation; Official Code
Emote Portrait Alive: Generating Expressive Portrait Videos with Audio2Video Diffusion Model under Weak Conditions
#Large language model# Generate high-definition short videos with one click using large AI models.
This project aims to reproduce Sora (OpenAI's T2V model); we hope the open-source community will contribute to this project.
Official implementation of the paper: DreamTalk: When Expressive Talking Head Generation Meets Diffusion Probabilistic Models
This is the official repository for DiffSpeaker: Speech-Driven 3D Facial Animation with Diffusion Transformer
[CVPR 2023] SadTalker: Learning Realistic 3D Motion Coefficients for Stylized Audio-Driven Single Image Talking Face Animation
Instant voice cloning by MIT and MyShell. Audio foundation model.
[CVPR 2024] This is the official source for our paper "SyncTalk: The Devil is in the Synchronization for Talking Head Synthesis"
Real3D-Portrait: One-shot Realistic 3D Talking Portrait Synthesis; ICLR 2024 Spotlight; Official code
[ECCV 2024] Champ: Controllable and Consistent Human Image Animation with 3D Parametric Guidance
http://www.facegood.cc
[WIP] Layer Diffusion for WebUI (via Forge)
Zero-Shot Speech Editing and Text-to-Speech in the Wild