AI research

Reddit ends its role as a free AI training data goldmine

Summary: Reddit plays a central role in training large language models. Now the social network is looking to monetize its data. Whether OpenAI’s GPT-3.5 and GPT-4, Meta’s LLaMA, or Google’s Bard: large language models are trained on Internet text, and a significant portion of that training data comes from Reddit threads. The fact that this …

Microsoft edges closer to zero-shot voice cloning

Summary: Microsoft presents NaturalSpeech 2, a text-to-speech model based on diffusion models that can clone any voice from just a short snippet of audio. Microsoft Research Asia and Microsoft Azure Speech developed NaturalSpeech 2 using a diffusion model that interacts with a neural audio codec, which compresses waveforms into vectors. The …
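
The pipeline described here (a neural audio codec that compresses waveforms into continuous latent vectors, plus a diffusion model that generates those latents conditioned on text and a short voice prompt) can be sketched roughly as follows. Everything in the snippet is a toy stand-in with made-up shapes and names, not Microsoft’s actual model or API.

```python
# Toy sketch of a latent-diffusion TTS pipeline in the spirit of NaturalSpeech 2:
# codec encoder -> diffusion in latent space (conditioned on text + voice prompt)
# -> codec decoder. All components are stand-ins, not the real system.
import numpy as np

rng = np.random.default_rng(0)
STEPS = 50

def codec_encode(waveform: np.ndarray, hop: int = 160) -> np.ndarray:
    """Stand-in codec encoder: compresses a waveform into latent vectors."""
    frames = waveform[: len(waveform) // hop * hop].reshape(-1, hop)
    proj = rng.standard_normal((hop, 8))           # stand-in learned weights
    return frames @ proj                            # (num_frames, 8)

def codec_decode(latents: np.ndarray, hop: int = 160) -> np.ndarray:
    """Stand-in codec decoder: maps latents back to a waveform."""
    proj = rng.standard_normal((8, hop))
    return (latents @ proj).reshape(-1)

def denoise_step(z_t, t, text_cond, prompt_latents):
    """Stand-in for one reverse diffusion step, conditioned on the text
    embedding and on latents from a short voice prompt (zero-shot cloning)."""
    predicted_noise = 0.1 * z_t + 0.01 * text_cond + 0.01 * prompt_latents.mean(0)
    return z_t - predicted_noise * (t / STEPS)

prompt_wave = rng.standard_normal(16000)            # ~1 s voice prompt
prompt_latents = codec_encode(prompt_wave)
text_cond = rng.standard_normal(8)                  # stand-in phoneme/text embedding

z = rng.standard_normal((100, 8))                   # start from noise in latent space
for t in range(STEPS, 0, -1):
    z = denoise_step(z, t, text_cond, prompt_latents)

speech = codec_decode(z)                            # waveform in the prompt's voice
print(speech.shape)
```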

Nvidia shows text-to-video for Stable Diffusion

Summary: Nvidia turns Stable Diffusion into a text-to-video model, generates high-resolution video, and shows how the model can be personalized. Nvidia’s generative AI model is based on diffusion models and adds a temporal dimension that enables temporally aligned image synthesis over multiple frames. The team trains a video model to generate several minutes of video of …
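
The core idea, inserting temporal layers on top of a pretrained image diffusion model so that the individual frames stay temporally aligned, can be illustrated with a toy sketch. The weights and layer functions below are invented placeholders, not Nvidia’s actual architecture.

```python
# Toy sketch of "add a temporal dimension to an image diffusion model":
# spatial layers from the pretrained image model process each frame on its own,
# while newly inserted temporal layers mix information across frames at every
# spatial position. Purely illustrative, not Nvidia's model.
import numpy as np

T, H, W, C = 8, 16, 16, 4                 # frames, height, width, latent channels
rng = np.random.default_rng(0)
x = rng.standard_normal((T, H, W, C))     # noisy video latents

W_spatial = rng.standard_normal((C, C))   # stand-in pretrained image-model weights (frozen)
W_temporal = rng.standard_normal((C, C))  # stand-in newly trained temporal weights

def spatial_layer(x):
    """Applies the image model per frame: no information flows between frames."""
    return np.tanh(x @ W_spatial)

def temporal_layer(x):
    """Mixes features along the frame axis so frames become temporally aligned.
    Here: a simple attention-like weighted average over time at each pixel."""
    scores = np.einsum("thwc,shwc->hwts", x, x) / np.sqrt(C)    # (H, W, T, T)
    weights = np.exp(scores) / np.exp(scores).sum(-1, keepdims=True)
    mixed = np.einsum("hwts,shwc->thwc", weights, x)
    return np.tanh(mixed @ W_temporal)

# One denoising block of the video model: spatial (frozen) then temporal (new).
h = spatial_layer(x)
h = h + temporal_layer(h)                 # residual temporal mixing
print(h.shape)                            # (8, 16, 16, 4)
```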

An old AI architecture shows off some new tricks

Summary: GigaGAN shows that Generative Adversarial Networks are far from obsolete and could be a faster alternative to Stable Diffusion in the future. Current generative AI models for images are diffusion models trained on large datasets that generate images based on text descriptions. They have replaced GANs (Generative Adversarial Networks), which were widely used in …
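
To illustrate why a GAN could be the faster alternative at inference time: a GAN maps noise to an image in a single forward pass, while diffusion sampling calls the network many times. The sketch below uses toy stand-in “networks” and is not GigaGAN itself.

```python
# Toy comparison: single-pass GAN generation vs. iterative diffusion sampling.
# Both "networks" are random linear stand-ins, only the call pattern matters.
import numpy as np

rng = np.random.default_rng(0)
LATENT, PIXELS = 64, 32 * 32 * 3
W_gan = rng.standard_normal((LATENT, PIXELS)) * 0.01
W_diff = rng.standard_normal((PIXELS, PIXELS)) * 0.001

def gan_generate(z):
    """One forward pass: latent noise -> image."""
    return np.tanh(z @ W_gan)

def diffusion_generate(steps=50):
    """Many forward passes: start from noise and denoise iteratively."""
    x = rng.standard_normal(PIXELS)
    for t in range(steps, 0, -1):
        predicted_noise = np.tanh(x @ W_diff)
        x = x - predicted_noise * (t / steps) * 0.1
    return x

img_gan = gan_generate(rng.standard_normal(LATENT))   # 1 network call
img_diff = diffusion_generate(steps=50)               # 50 network calls
print(img_gan.shape, img_diff.shape)
```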

Meta’s DINOv2 is a foundation model for computer vision

Summary: Meta’s DINOv2 is a foundation model for computer vision. The company demonstrates the model’s strengths and wants to combine DINOv2 with large language models. In May 2021, AI researchers at Meta presented DINO (Self-Distillation with no labels), a self-supervised AI model for image tasks such as classification or segmentation. With DINOv2, Meta is now …
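
The self-distillation idea behind DINO and DINOv2 (a student network learns to match a teacher network on different crops of the same image, with the teacher kept as an exponential moving average of the student, and no labels involved) can be sketched in a few lines. The “networks” below are toy linear stand-ins, not Meta’s ViT-based models or training recipe.

```python
# Toy sketch of self-distillation without labels: student matches teacher on
# different views of the same image; teacher is an EMA of the student.
import numpy as np

rng = np.random.default_rng(0)
D_IN, D_OUT = 32, 16
student_w = rng.standard_normal((D_IN, D_OUT)) * 0.1
teacher_w = student_w.copy()              # teacher starts as a copy of the student

def softmax(z, temp):
    z = z / temp
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def forward(w, crop):
    return crop @ w                       # stand-in for a ViT backbone + head

image = rng.standard_normal(D_IN)
for step in range(100):
    # Two random "crops" (here: noisy views) of the same image, no labels used.
    crop_a = image + 0.1 * rng.standard_normal(D_IN)
    crop_b = image + 0.1 * rng.standard_normal(D_IN)

    p_teacher = softmax(forward(teacher_w, crop_a), temp=0.04)   # sharper teacher
    p_student = softmax(forward(student_w, crop_b), temp=0.1)

    # Cross-entropy gradient w.r.t. the student weights
    # (temperature factor folded into the learning rate).
    grad = np.outer(crop_b, p_student - p_teacher)
    student_w -= 0.05 * grad

    # Teacher follows the student as an exponential moving average.
    teacher_w = 0.996 * teacher_w + 0.004 * student_w

print(float(-np.sum(p_teacher * np.log(p_student + 1e-9))))      # final CE loss
```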

Instruct-NeRF2NeRF lets you edit NeRFs via text prompt

Summary: Instruct-NeRF2NeRF applies methods from generative AI models to edit 3D scenes based on text input. Earlier this year, researchers at the University of California, Berkeley demonstrated InstructPix2Pix, a method that allows users to edit images in Stable Diffusion using text instructions. The method makes it possible to replace objects in images or change …
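
Instruct-NeRF2NeRF is described as iteratively rendering training views, editing them with an InstructPix2Pix-style model, and continuing to optimize the NeRF on the edited views. The toy sketch below only illustrates that loop; render(), edit_image(), and the “radiance field” are placeholders, not the actual method.

```python
# Toy sketch of text-driven NeRF editing via an iterative dataset update:
# render a training view, edit it according to the instruction, put the edited
# image back into the training set, keep optimizing the 3D scene.
import numpy as np

rng = np.random.default_rng(0)
H, W = 8, 8
dataset = [rng.random((H, W, 3)) for _ in range(4)]   # captured input views
nerf_state = np.zeros((H, W, 3))                      # toy "radiance field"

def render(nerf_state, view_idx):
    """Stand-in for rendering the NeRF from a training viewpoint."""
    return nerf_state

def edit_image(image, instruction):
    """Stand-in for an InstructPix2Pix-style editor applying a text instruction."""
    if "warmer" in instruction:
        image = image.copy()
        image[..., 0] = np.clip(image[..., 0] + 0.2, 0, 1)   # push the red channel
    return image

instruction = "make the scene warmer"
for iteration in range(100):
    view_idx = iteration % len(dataset)
    rendered = render(nerf_state, view_idx)
    # Periodically replace a training image with an edited version of the render,
    # so the 3D scene is gradually pulled toward the instruction.
    if iteration % 10 == 0:
        dataset[view_idx] = edit_image(rendered, instruction)
    # Toy "training step": move the field toward the (possibly edited) target view.
    nerf_state += 0.1 * (dataset[view_idx] - rendered)

print(nerf_state.mean(axis=(0, 1)))   # red drifts above green/blue, reflecting the edit
```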

Zip-NeRF is another step towards a digital time machine

Summary: People take photos for many reasons, one of which is to capture memories. The next generation of keepsake photos may be NeRFs, which get a quality and speed upgrade with Zip-NeRF. Google researchers demonstrate Zip-NeRF, a NeRF model that combines the advantages of grid-based techniques and the mipmap-based mip-NeRF 360. Grid-based NeRF methods, …
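
For context on the “grid-based techniques” mentioned here: instead of querying a large MLP for every 3D sample, grid-based NeRF variants store features in a voxel grid and read them out by interpolation, which is much cheaper per sample. The sketch below illustrates only that lookup; the anti-aliasing ideas Zip-NeRF takes from mip-NeRF 360 are not modeled, and all values are stand-ins.

```python
# Toy illustration of grid-based NeRF feature lookup: trilinear interpolation
# of learned voxel features at a 3D point. Not Zip-NeRF itself.
import numpy as np

rng = np.random.default_rng(0)
RES, FEAT = 16, 8
grid = rng.standard_normal((RES, RES, RES, FEAT))   # stand-in learned voxel features

def grid_lookup(point):
    """Trilinear interpolation of grid features at a 3D point in [0, 1)^3."""
    p = np.clip(point, 0.0, 1.0 - 1e-6) * (RES - 1)
    i0 = np.floor(p).astype(int)
    i1 = np.minimum(i0 + 1, RES - 1)
    w = p - i0                                       # fractional offsets
    out = np.zeros(FEAT)
    for dx in (0, 1):
        for dy in (0, 1):
            for dz in (0, 1):
                idx = (i1[0] if dx else i0[0],
                       i1[1] if dy else i0[1],
                       i1[2] if dz else i0[2])
                weight = ((w[0] if dx else 1 - w[0]) *
                          (w[1] if dy else 1 - w[1]) *
                          (w[2] if dz else 1 - w[2]))
                out += weight * grid[idx]
    return out

features = grid_lookup(np.array([0.3, 0.7, 0.5]))    # features for one sample point
print(features.shape)                                # (8,)
```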

OpenAI CEO sees ‘end of an era’ in number of parameters

Newsletter: In recent years, the potential progress of large language models has been measured primarily by the number of parameters. Sam Altman, CEO of OpenAI, believes that this practice is no longer useful. Altman compares the race to increase the number of parameters in large language models to the race to increase the clock speed …