OpenCV founders launch AI video startup to take on OpenAI and Google

An AI startup launched by the original developers behind OpenCV has unveiled technology capable of producing highly realistic, human-focused videos lasting up to five minutes, a significant advance over competitors such as OpenAI’s Sora and Google’s Veo.

Debuting with $2 million in seed funding, CraftStory introduces Model 2.0, a video generation platform that tackles a major hurdle in AI video creation: length. While OpenAI’s Sora maxes out at 25 seconds and most rivals generate clips under 10 seconds, CraftStory’s system produces continuous, coherent videos comparable in duration to typical YouTube tutorials or product demos.

This leap forward holds immense potential for businesses aiming to scale video content for training, marketing, and customer education, areas where short AI-generated snippets have fallen short despite their polished visuals.

Innovative Parallel Processing: The Key to Extended AI Video Generation

CraftStory’s breakthrough hinges on a novel parallelized diffusion architecture, which fundamentally rethinks how AI models generate video compared to the sequential techniques most competitors use.

Conventional video generation relies on diffusion algorithms applied to expanding 3D volumes, where time is the third dimension. Extending video length demands exponentially larger models, more extensive datasets, and heavier computational power.

In contrast, CraftStory’s approach runs multiple smaller diffusion processes simultaneously across the entire video timeline, linked by bidirectional constraints. This means later frames can influence earlier ones, preventing error accumulation that typically occurs when videos are generated sequentially.

Instead of producing short segments and stitching them together, CraftStory’s system processes the full five-minute video concurrently, ensuring consistency and quality throughout.
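
The idea described above can be illustrated with a toy sketch. This is not CraftStory’s architecture, which the company has not published in detail; it is a minimal illustration under stated assumptions. Each frame is reduced to a single number, the hypothetical helper `make_windows` splits the timeline into overlapping windows, and the hypothetical `parallel_denoise` stands in for a diffusion model by nudging each window toward its own mean. Every window is updated from the same snapshot, so the updates could run concurrently, and frames shared by two windows are averaged, which lets information travel in both directions along the timeline.

```python
import random


def make_windows(num_frames, window, overlap):
    """Split the timeline into overlapping windows of frame indices.

    Assumes 0 <= overlap < window. Names and sizes are illustrative only.
    """
    step = window - overlap
    starts = range(0, max(num_frames - overlap, 1), step)
    return [list(range(s, min(s + window, num_frames))) for s in starts]


def parallel_denoise(frames, windows, steps, strength=0.5):
    """Toy stand-in for many small diffusion processes running side by side.

    Each window proposes new values for its frames using only the current
    snapshot, so the per-window work is independent. Frames that belong to
    two windows take the average of both proposals, which couples
    neighbouring windows in both directions.
    """
    frames = list(frames)
    for _ in range(steps):
        sums = [0.0] * len(frames)
        counts = [0] * len(frames)
        for idx in windows:
            vals = [frames[i] for i in idx]
            mean = sum(vals) / len(vals)
            for i, v in zip(idx, vals):
                # Placeholder "denoiser": pull the frame toward the window mean.
                sums[i] += v + strength * (mean - v)
                counts[i] += 1
        frames = [s / c for s, c in zip(sums, counts)]
    return frames


random.seed(0)
noise = [random.gauss(0.0, 1.0) for _ in range(20)]
windows = make_windows(len(noise), window=8, overlap=4)
result = parallel_denoise(noise, windows, steps=50)

# The spread across frames shrinks as the windows agree with each other.
print(max(noise) - min(noise), max(result) - min(result))
```

In this sketch, changing only the last frame of the input changes the first frame of the output after a few steps, because the change passes backward through the overlapping windows. That is the toy analogue of later frames influencing earlier ones; a sequential generator that only conditions on past frames has no such path.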

Moreover, the company’s training data is sourced from proprietary footage rather than internet-scraped videos. By collaborating with professional studios to film actors using high-frame-rate cameras, CraftStory captures crisp, detailed motion, including subtle hand movements, and avoids the motion blur common in standard 30fps online videos.

“Our results demonstrate that high-quality data, not massive datasets or budgets, is the cornerstone of superior video generation,” said Victor Erukhimov, CraftStory’s CEO.

Currently, Model 2.0 operates as a video-to-video system: users upload a still image to animate alongside a “driving video” featuring a person whose movements the AI mimics. CraftStory offers a library of professionally recorded driving videos, with actors receiving royalties when their motion data is used; users can also supply their own footage.

The platform generates 30-second low-resolution clips in about 15 minutes. Advanced lip-sync technology aligns mouth movements with scripts or audio, while gesture synchronization ensures body language matches speech rhythm and emotional tone.

Competing with Giants on a Lean Budget

CraftStory’s $2 million funding round was primarily backed by Andrew Filev, who sold his project management software company Wrike to Citrix in 2021 and now leads an AI coding startup. This modest capital contrasts sharply with the sums flowing to competitors; OpenAI alone raised billions of dollars in its latest round.

Erukhimov challenges the belief that enormous funding is essential for success. “While compute resources help, throwing billions at a concept without substance benefits no one,” he remarked.

Investor Filev supports this lean, focused approach: “Investing in startups is a bet on people. As Margaret Mead famously said, never underestimate what a small, dedicated team can achieve.”

Filev highlights CraftStory’s strategic focus: “While large labs race to build broad, general-purpose video models, CraftStory dives deep into a specialized niche: long-form, engaging, human-centric video.”

The Importance of Computer Vision Expertise in AI Video

Erukhimov’s background in computer vision, rather than transformer-based AI, sets him apart. He was an early contributor to OpenCV, the open-source computer vision library widely adopted across industries, boasting over 50,000 citations.

When Intel scaled back OpenCV support in the mid-2000s, Erukhimov co-founded Itseez to maintain and enhance the library, expanding its applications into automotive safety before Intel acquired the company in 2016.

Filev notes this expertise is critical for video generation: “Generative AI video isn’t just about creating images; it requires deep understanding of motion, facial expressions, temporal consistency, and human movement dynamics, areas where Victor excels.”

Targeting Enterprise Needs: Training and Product Demonstrations

Unlike many AI video startups focusing on consumer creativity, CraftStory prioritizes enterprise applications.

“Our primary market is B2B, especially software companies needing engaging training, product, and launch videos,” Erukhimov explained.

Longer videos are essential in these contexts, as brief clips cannot adequately showcase complex software features or detailed tutorials.

“For videos up to five minutes with consistent quality, our platform is the ideal choice,” Erukhimov added.

Filev agrees, emphasizing the market gap: “Short clips, no matter how polished, don’t suffice for commercial use. Businesses need videos lasting 30 seconds to several minutes.”

CraftStory anticipates significant cost reductions for clients. Filev estimates that “small businesses could produce content in minutes that previously cost tens of thousands and took months.”

The startup also appeals to creative agencies producing corporate videos, offering a faster, more affordable alternative to traditional multi-day shoots by transforming actor footage into AI-generated content.

Looking ahead, CraftStory plans to launch a text-to-video model enabling users to generate long-form videos directly from scripts. They are also developing support for dynamic camera movements, including the popular “walk-and-talk” style seen in premium advertising.

Positioning Within a Diverse and Competitive Market

CraftStory enters a bustling AI video generation landscape. OpenAI’s Sora has garnered attention despite limited public access, while Google’s Veo and a number of other platforms offer varied video generation capabilities.

Erukhimov acknowledges the competition but stresses CraftStory’s unique focus on human-centric, long-form video content. The company’s strategy centers on rapid innovation and market penetration rather than building insurmountable technical barriers.

Filev envisions the market segmenting into layers: major tech firms provide powerful, general-purpose generation APIs, while specialized companies like CraftStory build tailored production pipelines and workflows atop these engines.

Model 2.0 is currently accessible at app.craftstory.com/model-2.0, with early access available for users and enterprises eager to explore the technology. Although competing against well-funded incumbents is challenging, Erukhimov remains optimistic.

“AI-generated video is poised to become the dominant medium for corporate storytelling,” he concluded.
