AI research

Meta's DINOv2 is a foundation model for computer vision

Summary Meta's DINOv2 is a foundation model for computer vision. The company demonstrates the model's strengths and wants to combine DINOv2 with large language models. In May 2021, AI researchers at Meta presented DINO (Self-Distillation with no labels), a self-supervised AI model for image tasks such as classification or segmentation. With DINOv2, Meta is now …

Instruct-NeRF2NeRF lets you edit NeRFs via text prompt

Summary Instruct-NeRF2NeRF uses methods from generative AI models to edit 3D scenes according to text input. Earlier this year, researchers at the University of California, Berkeley demonstrated InstructPix2Pix, a method that allows users to edit images in Stable Diffusion using text instructions. The method makes it possible to replace objects in images or change …

Zip-NeRF is another step towards a digital time machine

Summary People take photos for many reasons, one of which is to capture memories. The next generation of keepsake photos may be NeRFs, which get a quality and speed upgrade with Zip-NeRF. Google researchers demonstrate Zip-NeRF, a NeRF model that combines the advantages of grid-based techniques and the mipmap-based mip-NeRF 360. Grid-based NeRF methods, …

OpenAI CEO sees ‘end of an era’ in number of parameters

Newsletter In recent years, the potential progress of large language models has been measured primarily by the number of parameters. Sam Altman, CEO of OpenAI, believes that this practice is no longer useful. Altman compares the race to increase the number of parameters in large language models to the race to increase the clock speed …

Google’s medical language model “Med-PaLM 2” enters pilot phase with first customers

Summary Google is rolling out Med-PaLM 2 on a limited basis for initial testing. Update April 14, 2023: Google Cloud announces that Med-PaLM 2 will be rolled out to select Google Cloud customers for a “limited test” in the coming weeks. The goal, the company says, is to explore safe, responsible and meaningful use scenarios. …

Here’s how OpenAI’s DALL-E 3 could leapfrog the competition

Summary All generative AI models for images currently use diffusion models. OpenAI presents an alternative that is significantly faster and could power new models like DALL-E 3. DALL-E 2, Stable Diffusion, or Midjourney use diffusion models that gradually synthesize an image from noise during image generation. The same iterative process is used in audio or …

An image model at Midjourney’s level?

Summary A new beta version of Stable Diffusion delivers much more aesthetic and photorealistic results than the previous version. Will this make commercial offerings obsolete? While Stable Diffusion is the most developed open-source image model, it can’t always match the quality and especially the accessibility of commercial competitors like Midjourney. Its strength so far is …

Google’s medical language model Med-PaLM 2 passes exam questions

Summary Med-PaLM is Google’s variant of the PaLM language model optimized for medical questions. The latest version is designed to answer medical questions reliably at an expert level. Last December, Google unveiled Med-PaLM, a version of its large PaLM (Pathways Language Model) optimized for answering medical questions. Med-PaLM was developed using a special …

LERF is like Google for the Metaverse

Summary Neural Radiance Fields (NeRFs) are a promising graphics technology that can transform the real world into 3D relatively quickly and with high quality. LERF (Language Embedded Radiance Fields) integrates the capabilities of large language models into NeRFs. This enables accurate 3D object recognition without additional training. UC Berkeley researchers present LERF, which volumetrically integrates …

OpenAI kills its Codex code model, recommends GPT-3.5 instead

Summary OpenAI CEO Sam Altman announced that scientists will continue to have access to the model after a wave of criticism over the Codex shutdown. “We didn’t realize how much people liked this model; we will continue to support it for researchers!” he wrote on Twitter. Only a few days after a first notice, OpenAI …
