feat: add comprehensive backend topics and fix build issues

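## Additions

### Appendix document expansion
- Expand the frontend project architecture document (frontend-project-architecture.md)
- Expand the backend project architecture document (backend-project-architecture.md)
- Expand the data governance document (data-governance.md)
- Expand the data visualization document (data-visualization.md)
- Expand the distributed systems document (distributed-systems.md)
- Expand the high availability document (high-availability.md)
- Expand the monolith-to-microservices document (monolith-to-microservices.md)
- Expand the system design methodology document (system-design-methodology.md)
- Expand the Docker containers document (docker-containers.md)
- Expand the Kubernetes document (kubernetes.md)
- Expand the Linux basics document (linux-basics.md)
- Expand the neural networks document (neural-networks.md)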

### New interactive components
- Data governance components: DataQualityDemo, DataGovernanceFrameworkDemo, DataLineageDemo
- Data visualization components: ChartTypeSelectorDemo, DashboardLayoutDemo
- Distributed systems components: CAPTheoremDemo, ConsistencyModelsDemo, DistributedChallengesDemo
- High availability components: AvailabilityCalculatorDemo, FailoverStrategyDemo
- System design components: SystemDesignStepsDemo, CapacityEstimationDemo
- Docker container components: DockerArchitectureDemo, DockerLifecycleDemo
- Kubernetes components: K8sArchitectureDemo, K8sWorkloadsDemo
- Linux basics components: LinuxFileSystemDemo, LinuxCommandDemo, LinuxPermissionsDemo
- Neural network components: NeuronDemo, NetworkLayersDemo, NetworkArchitectureDemo
- Monolith-to-microservices component: ArchEvolutionDemo
- Project architecture component: ProjectArchitectureComparisonDemo
- Appendix navigation component: AppendixFlowMap

### English version restructuring
- Rename the en-us directory to en
- Update the language code in the related configuration and components

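## Bug fixes
- Fix the build failure caused by duplicate component import statements in index.js
- Restore the commented-out InvertedIndexDemo and SearchRelevanceDemo imports
- Fix the language-switching problem caused by HomeFeatures.vue using en-us while config.mjs uses en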

## Other improvements
- Add a build script (scripts/build.mjs)
- Update dependency versions
This commit is contained in:
sanbuphy
2026-02-26 04:35:28 +08:00
parent df51f84ab5
commit ef70b1d8e1
84 changed files with 12917 additions and 3477 deletions
@@ -0,0 +1,186 @@
# Appendix
Welcome to the **Appendix** section! This is a collection of artificial intelligence fundamentals and full-stack development basics, serving as an important reference library during your learning journey.
## Content Categories
### AI Fundamentals
Understand the core concepts, development history, and cutting-edge technical principles of artificial intelligence:
<NavGrid>
<NavCard
href="/en-us/appendix/prompt-engineering/"
title="Prompt Engineering"
description="Master the art of efficient dialogue with AI to unlock the potential of large models"
/>
<NavCard
href="/en-us/appendix/ai-evolution"
title="AI Evolution History"
description="Review key milestones in AI development and understand the trajectory of technological evolution"
/>
<NavCard
href="/en-us/appendix/llm-intro"
title="Large Language Models"
description="Deep yet accessible explanation of how Large Language Models (LLMs) work and their applications"
/>
<NavCard
href="/en-us/appendix/vlm-intro"
title="Multimodal Large Models"
description="Explore advanced models capable of processing multiple data modalities such as images and audio"
/>
<NavCard
href="/en-us/appendix/image-gen-intro"
title="AI Image Generation Principles"
description="Uncover the underlying logic and technical implementation of AI image generation"
/>
<NavCard
href="/en-us/appendix/audio-intro"
title="AI Audio Models"
description="Understand AI applications in speech synthesis, recognition, and music generation"
/>
<NavCard
href="/en-us/appendix/context-engineering"
title="Context Engineering"
description="Learn how to optimize context management to improve long-range coherence of AI tasks"
/>
<NavCard
href="/en-us/appendix/agent-intro"
title="Agent Intelligence"
description="Explore AI agent architectures with autonomous decision-making and execution capabilities"
/>
<NavCard
href="/en-us/appendix/ai-capability-dictionary"
title="AI Capability Dictionary"
description="A quick reference handbook for commonly used terms and core concepts in the AI field"
/>
</NavGrid>
### Frontend Basics
Solidify the technical foundation of frontend development:
<NavGrid>
<NavCard
href="/en-us/appendix/web-basics"
title="HTML/CSS/JS Basics"
description="The three pillars of building web pages, essential for frontend development beginners"
/>
<NavCard
href="/en-us/appendix/frontend-evolution"
title="Frontend Evolution History"
description="Understand the evolution of frontend technology stacks and grasp technology development trends"
/>
<NavCard
href="/en-us/appendix/frontend-performance"
title="Frontend Performance Optimization"
description="Learn key strategies to improve webpage loading speed and interaction smoothness"
/>
<NavCard
href="/en-us/appendix/canvas-intro"
title="Canvas 2D Basics"
description="Master the Canvas drawing API to achieve cool graphics and animation effects"
/>
<NavCard
href="/en-us/appendix/url-to-browser"
title="From URL to Browser Display"
description="An end-to-end walkthrough of how a browser loads and renders a page"
/>
<NavCard
href="/en-us/appendix/browser-devtools/"
title="Browser DevTools"
description="Proficiently use developer tools to efficiently locate and solve frontend issues"
/>
</NavGrid>
### Backend Basics
Master the core concepts of backend development:
<NavGrid>
<NavCard
href="/en-us/appendix/backend-evolution"
title="Backend Evolution History"
description="From monolithic to microservices, exploring the evolution of backend architecture"
/>
<NavCard
href="/en-us/appendix/backend-languages"
title="Backend Programming Languages"
description="Compare the characteristics and applicable scenarios of mainstream backend languages to choose the best technology stack"
/>
<NavCard
href="/en-us/appendix/database-intro"
title="Database Principles"
description="Understand core database principles and master the art of data storage and retrieval"
/>
<NavCard
href="/en-us/appendix/cache-design"
title="System Cache Design"
description="Learn caching strategies to improve a system's ability to handle high concurrency"
/>
<NavCard
href="/en-us/appendix/queue-design"
title="Message Queue Design"
description="Master the key role of message queues in decoupling and peak shaving"
/>
<NavCard
href="/en-us/appendix/auth-design"
title="Authentication Principles & Practice"
description="Build secure identity authentication and permission management systems"
/>
<NavCard
href="/en-us/appendix/tracking-design"
title="Tracking Design"
description="Scientifically design data tracking to provide data support for product decisions"
/>
<NavCard
href="/en-us/appendix/operations"
title="Online Operations"
description="Master operations skills for system deployment, monitoring, and troubleshooting"
/>
</NavGrid>
### General Skills
Basic knowledge of software development:
<NavGrid>
<NavCard
href="/en-us/appendix/api-intro"
title="API Basics"
description="Basic knowledge of API interface design and development"
/>
<NavCard
href="/en-us/appendix/ide-intro/"
title="IDE Principles"
description="Understand the internal working mechanism of Integrated Development Environments (IDEs)"
/>
<NavCard
href="/en-us/appendix/terminal-intro"
title="Terminal Basics"
description="Master basic command-line terminal operations to improve development efficiency"
/>
<NavCard
href="/en-us/appendix/git-intro"
title="Git Detailed Introduction"
description="Deeply understand Git version control principles and advanced usage"
/>
<NavCard
href="/en-us/appendix/computer-networks"
title="Computer Networks"
description="Basic knowledge of network protocols and communication principles"
/>
<NavCard
href="/en-us/appendix/deployment"
title="Deployment & Launch"
description="Complete process and best practices for application deployment and release"
/>
</NavGrid>
## Usage Suggestions
- Use this section as reference material while you learn, and consult it as needed
- When you run into an unfamiliar technical concept, look for an explanation here first
- Read through it once to build a complete picture of the knowledge involved
This is your technical reference library; come back to it whenever you need to.
@@ -0,0 +1 @@
../../assets
@@ -0,0 +1,26 @@
---
layout: home
navbar: false
hero:
name: 'Easy-Vibe'
text: 'AI Coding Guide from Scratch'
tagline: 'A new coding paradigm for everyone. Whether you are a PM or a Full Stack Dev, find your AI coding path here.'
typingTagline:
- Coding, reimagined.
- Complexity, simplified.
- Every step, just right.
- Think it. Build it.
- Your pace. AI keeps up.
- From first character to complete system.
- Less friction. More creation.
- This is how coding should feel.
actions:
- theme: brand
text: Start Learning
link: /en/stage-0/
- theme: alt
text: Course Outline
link: /en/stage-0/
---
<HomeFeatures />
Binary file not shown. Size: 374 KiB
@@ -0,0 +1,182 @@
:root {
/* Adjust the spacing between sidebar groups */
--vp-sidebar-nav-section-gap: 8px;
--ev-doc-font-size: 13px;
--ev-doc-line-height: 1.65;
}
.vp-doc {
font-size: var(--ev-doc-font-size);
line-height: var(--ev-doc-line-height);
--el-font-size-extra-large: calc(var(--ev-doc-font-size) + 6px);
--el-font-size-large: calc(var(--ev-doc-font-size) + 4px);
--el-font-size-medium: calc(var(--ev-doc-font-size) + 2px);
--el-font-size-base: var(--ev-doc-font-size);
--el-font-size-small: calc(var(--ev-doc-font-size) - 1px);
--el-font-size-extra-small: calc(var(--ev-doc-font-size) - 2px);
--el-font-line-height-primary: var(--ev-doc-line-height);
}
.vp-doc p,
.vp-doc ul,
.vp-doc ol {
line-height: var(--ev-doc-line-height) !important;
}
.vp-doc :where(p, ul, ol, table, blockquote, pre, details, figure) {
margin: 10px 0;
}
.vp-doc blockquote {
font-size: 0.9em !important;
color: var(--vp-c-text-2);
}
.vp-doc blockquote p {
font-size: inherit !important;
line-height: 1.4 !important;
}
.vp-doc :where(li) {
margin: 4px 0;
}
.vp-doc :where(ul, ol) {
padding-left: 1.15em;
}
.vp-doc :where(h1, h2, h3, h4, h5, h6) {
line-height: 1.3;
}
.vp-doc :where(h1) {
margin: 22px 0 12px;
}
.vp-doc :where(h2) {
margin: 20px 0 10px;
}
.vp-doc h2 {
margin: 16px 0 8px !important;
padding-top: 10px !important;
border-top: 0 !important;
}
.vp-doc h2 .header-anchor {
top: 10px !important;
}
.vp-doc :where(h3) {
margin: 18px 0 8px;
}
.vp-doc :where(h4, h5, h6) {
margin: 16px 0 8px;
}
.vp-doc :where(hr) {
margin: 14px 0;
}
.vp-doc :where(th, td) {
padding: 6px 10px;
}
.vp-doc :where(:not(pre) > code) {
font-size: 0.95em;
}
/* Reduce the bottom spacing of top-level titles (e.g. "Frontend Development") */
.VPSidebarItem.level-0 {
padding-bottom: 4px !important;
}
/* Reduce the spacing between a top-level title's text and the submenu below it */
.VPSidebarItem.level-0 > .item {
padding-bottom: 2px !important;
}
/* Adjust the spacing between submenu items, for all levels */
.VPSidebarItem.level-1 .item,
.VPSidebarItem.level-2 .item,
.VPSidebarItem.level-3 .item,
.VPSidebarItem.level-4 .item {
padding-top: 2px !important;
padding-bottom: 2px !important;
min-height: 24px !important; /* Force a smaller minimum height */
}
/* Override specific class names that may be present, to keep the layout compact */
.VPSidebarGroup {
padding-top: 6px !important;
padding-bottom: 6px !important;
}
/* Further tighten the spacing between a group title and its first item */
.VPSidebarItem.level-0 + .VPSidebarItem.level-1 {
margin-top: -2px !important;
}
/* Tighten the line height of the group title itself */
.VPSidebarItem.level-0 .text {
line-height: 1.3 !important;
}
/* Tighten the line height of child items */
.VPSidebarItem.level-1 .text,
.VPSidebarItem.level-2 .text,
.VPSidebarItem.level-3 .text {
line-height: 1.4 !important;
padding: 0 !important; /* Remove the text's own padding */
}
/* Force the link itself to have no extra spacing */
.VPSidebarItem .VPLink {
padding-top: 2px !important;
padding-bottom: 2px !important;
min-height: auto !important;
}
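/* Image height limits: adjust max-height according to aspect ratio */
/* The taller the image (larger height-to-width ratio), the lower the max-height, so it does not take up too much vertical space */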
.vp-doc img.img-tall {
max-height: 380px !important;
max-width: 100% !important;
width: auto !important;
height: auto !important;
}
.vp-doc img.img-very-tall {
max-height: 280px !important;
max-width: 100% !important;
width: auto !important;
height: auto !important;
}
.vp-doc img.img-ultra-tall {
max-height: 200px !important;
max-width: 100% !important;
width: auto !important;
height: auto !important;
}
.vp-doc img.img-limit-width {
max-width: 100% !important;
max-height: 320px !important;
width: auto !important;
height: auto !important;
}
.vp-doc img.img-limit-height {
max-height: 450px !important;
max-width: 100% !important;
width: auto !important;
height: auto !important;
}
/* Fix tagline wrapping issues */
.VPHomeHero .tagline {
white-space: nowrap;
max-width: none !important;
}
@@ -0,0 +1,259 @@
---
title: 'From Idea to AI Product - Easy-Vibe Learning Roadmap'
description: 'A complete roadmap for learning AI programming, from zero experience to full-stack development. Practice vibe coding with AI tools such as Claude Code and Cursor, and learn product thinking, full-stack development, and AI capability integration.'
---
# From Idea to AI Product
In the past, building software had a high barrier: you had to understand programming and algorithms and have years of project experience.
Now it's different. As long as you have an idea, AI can help you write the code.
This is a huge change: **natural language is becoming a programming language**.
The emergence of Large Language Models (LLMs) has turned software development from something reserved for technical experts into a tool everyone can use. The hardest part used to be "how to write the code"; now the hardest part is "**What do you want to build?**"
> **What is Vibe Coding?**
> Simply put, it's "programming by speaking": you complete a programming project by conversing with AI instead of writing the code yourself.
Of course, letting AI write code is just the first step. To make a truly usable product, you will still encounter these questions:
- How do you get AI to write clean, maintainable code?
- How do you piece scattered code together into a runnable application?
- How do you take the application live so that people actually use it?
- How do you build AI capabilities such as text generation and image recognition into your product?
This course answers these questions.
Whether you are a student, teacher, doctor, worker, or anyone else with no technical background, you don't need to spend years learning programming first; in two weeks, you can build a runnable, demonstrable product prototype.
| Your Identity | This Course Can Help You |
|---------|-------------|
| Student | Assignments, competitions, startup projects: build them yourself instead of asking for help |
| Working Professional | Automate repetitive work, improve efficiency, or even start a side project |
| Product Manager / Designer | Ideas no longer stay on paper; quickly build demos to show bosses and clients |
| Entrepreneur / SME Owner | Validate ideas at low cost; build an MVP without spending tens of thousands on outsourcing |
| Teacher / Educator | Build teaching tools, courseware, and auto-generated exercises to improve teaching efficiency |
| Doctor / Lawyer / Specialist | Automate domain-specific workflows and build your own efficiency tools |
| Anyone | Use AI to solve specific problems in life and work, making the impossible possible |
In the AI era, ideas and execution matter more than technical skill.
## Growth Path: From "Using AI" to "Making AI Products"
<div class="stage-intro">
<div class="stage-card">
<div class="stage-icon">🎮</div>
<h3>Getting Started</h3>
<p class="stage-role">Experience AI Programming</p>
<div class="stage-tags">
<span>Snake Mini-game</span>
<span>Zero Basics to Start</span>
<span>Vibe Coding First Experience</span>
<span>Generate in Minutes</span>
</div>
</div>
</div>
<div class="stage-grid">
<div class="stage-card">
<div class="stage-icon">🛠️</div>
<h3>Stage One</h3>
<p class="stage-role">Product Manager / Operations</p>
<div class="stage-tags">
<span>AI IDE (Cursor/Claude)</span>
<span>Requirement Deconstruction & Prototype</span>
<span>Integrate AI Capabilities</span>
<span>Full Demo Development</span>
</div>
</div>
<div class="stage-card">
<div class="stage-icon">💻</div>
<h3>Stage Two</h3>
<p class="stage-role">Junior-Mid Developer / Indie Dev</p>
<div class="stage-tags">
<span>Figma to Code</span>
<span>Supabase Database</span>
<span>Stripe Payment Integration</span>
<span>Dify Knowledge Base</span>
</div>
</div>
<div class="stage-card">
<div class="stage-icon">🚀</div>
<h3>Stage Three</h3>
<p class="stage-role">Senior Developer / Architect</p>
<div class="stage-tags">
<span>Web/Mini-program/Multi-platform</span>
<span>MCP Advanced Tools</span>
<span>RAG & LangGraph</span>
<span>Senior Engineer Thinking</span>
</div>
</div>
</div>
<style>
.stage-intro {
margin: 20px auto;
max-width: 400px;
}
.stage-grid {
display: grid;
grid-template-columns: repeat(auto-fit, minmax(160px, 1fr));
gap: 12px;
margin: 16px 0;
}
.stage-card {
border: 1px solid var(--vp-c-divider);
border-radius: 10px;
padding: 12px;
background-color: var(--vp-c-bg-soft);
transition: all 0.3s ease;
display: flex;
flex-direction: column;
align-items: center;
text-align: center;
height: 100%;
}
.stage-card:hover {
transform: translateY(-2px);
background-color: var(--vp-c-bg-mute);
box-shadow: 0 4px 16px rgba(0, 0, 0, 0.05);
border-color: var(--vp-c-brand);
}
.stage-icon {
font-size: 2rem;
margin-bottom: 8px;
line-height: 1;
}
.stage-card h3 {
margin: 0 0 4px 0 !important;
font-size: 1rem;
font-weight: 600;
line-height: 1.2;
}
.stage-role {
margin: 0 0 8px 0 !important;
font-size: 0.8rem;
color: var(--vp-c-text-2);
font-weight: 500;
}
.stage-tags {
display: flex;
flex-wrap: wrap;
justify-content: center;
gap: 4px;
}
.stage-tags span {
font-size: 0.7rem;
padding: 1px 6px;
border-radius: 3px;
background-color: var(--vp-c-bg-alt);
color: var(--vp-c-text-2);
border: 1px solid var(--vp-c-divider);
}
.stage-card:hover .stage-tags span {
background-color: var(--vp-c-bg);
border-color: var(--vp-c-brand-dimm);
color: var(--vp-c-brand-dark);
}
</style>
Through this complete learning path, you will gain:
- **Vibe Coding Development Ability:** Use vibe-coding thinking and AI coding tools to multiply your development efficiency. Instead of memorizing syntax, you learn how to guide AI to generate high-quality code.
- **Full-stack Development Skills:** From UI design to frontend implementation, from database design to API development, and from local development to cloud deployment, master the full technology stack of modern web applications.
- **AI Capability Integration:** Learn to call multimodal AI APIs and integrate text, image, and voice capabilities into your applications, building intelligent products with techniques such as RAG.
- **Product Thinking and Operations Ability:** From user research to requirement breakdown, from MVP design to product iteration, and from payment integration to user management, form a complete loop of product development and operation.
# What Can You Do After Learning?
## Stage One: Build Your First Product Prototype
This stage suits learners with no programming background, or those who know a little but lack confidence. You don't need to study a lot of theory first; just follow the steps and learn to use AI tools to write code along the way.
**After learning, you can**:
- Independently complete a web application using AI programming tools
- Turn product ideas into clickable, interactive prototypes
- Add AI functions to the prototype (e.g., text-to-image, intelligent dialogue)
- Know how to troubleshoot and solve problems when encountering errors
Simply put, you can make something "runnable and demonstrable to others."
You will first experience AI programming through mini-games, then learn to use AI programming tools to write code and fix errors. From there you start with simple pages and gradually build interactive multi-page applications, adding AI functions such as text-to-image and intelligent dialogue. Finally, you complete a full project on your own, so your ideas have a real chance of becoming products.
# Why Use Project-Based Training?
> **Real-world Challenges**
>
> The reason is simple: judging by where most students are today, going straight into the workplace often means getting stuck under the hard knocks of real projects, bosses, and clients. A more common real-world scenario looks like this:
> Your mentor / boss: We want to do xxx, the goal is to achieve yyy effect.
>
> Documentation? Ready-made frameworks? Detailed requirement specifications? Often they don't exist.
Many tasks in real work are essentially solving problems never seen before in a highly uncertain environment: requirements are vague, boundaries are changing, no one tells you the standard answer, and you need to look up information, do experiments, build prototypes, iterate continuously, and finally give a "runnable, usable, and launchable" solution.
What this course aims to do is let you take those hard knocks in advance, in a relatively safe environment:
- Make you practice breaking down problems, designing solutions, and finding information yourself through project tasks that look difficult.
- Teach you to read, understand, and modify a medium-to-large codebase through scaffolds and code that deliberately do not hold your hand.
- Let you experience a real product going from 0 to 1, through the complete loop from idea to launch.
In the short term, this kind of training is painful; in the long term, it greatly improves your competitiveness in job hunting and career development: you become better at getting things done, better at finding breakthroughs in uncertain environments, and more capable of turning AI into shipped products instead of staying at the "playing with demos" stage.
# The Art of Questioning: An Essential Skill in the AI Era
In the AI era, asking questions is a basic skill too. For the same code and the same error, **how you ask largely determines the answer AI can give**: vague generalities, or concrete modifications you can apply step by step.
**Develop Good Habits**: Treat "asking AI" as part of the daily development flow: ask immediately when you don't understand or get stuck.
## Why is this an Essential Skill?
- **Real life rarely has complete documentation**: Most of the time you face unclear requirements, half-finished code, and scattered error messages.
- **AI is both a tutor and a colleague at your side**: Those who ask well turn it into "high-quality pair programming."
- **Communication sets your ceiling**: The more key information you provide and the more precisely you constrain the output format, the more usable the answer will be.
**Common Misconception**: Asking only "Why is there an error?" usually gets you a handful of guesses. You get an actionable solution only after you fill in the context.
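
One way to picture "filling in the context" is as a small template. The sketch below is only an illustration: `buildQuestion` and its field names (`goal`, `error`, `code`, `tried`, `format`) are hypothetical and not part of this course or of any AI tool.

```javascript
// Hypothetical helper: assembles a question that carries the context an AI needs.
// All field names here are illustrative assumptions, not a real API.
function buildQuestion({ goal, error, code, tried = [], format }) {
  const sections = [
    `Goal: ${goal}`,
    `Full error message:\n${error}`,
    `Relevant code:\n${code}`,
  ];
  if (tried.length > 0) {
    // List what was already attempted so the AI does not repeat it.
    sections.push(`Already tried:\n${tried.map((t) => `- ${t}`).join('\n')}`);
  }
  if (format) {
    // Constrain the shape of the answer.
    sections.push(`Please answer as: ${format}`);
  }
  return sections.join('\n\n');
}

const question = buildQuestion({
  goal: 'Render a list of users on the page',
  error: "TypeError: Cannot read properties of undefined (reading 'map')",
  code: 'const items = data.users.map((u) => u.name);',
  tried: ['Checked that the API request returns status 200'],
  format: 'numbered steps, with the exact line to change',
});
console.log(question);
```

Compared with "Why is there an error?", the assembled text states the goal, the complete error, the code involved, what was already tried, and the desired answer format.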
## How to "Feed" Information to AI: Screenshots vs Copy-Paste
Both methods are fine, but for different purposes:
| Method | Applicable Scenarios | Key Requirements |
| ------------ | ----------------------------------------- | ----------------------------------------- |
| **Copy-Paste** | Error stacks, logs, code, configuration, API returns | Be as complete as possible; don't just take one line of keywords |
| **Screenshot** | UI layout issues, interaction anomalies, buttons you can't find in a tool | Capture the full screen and highlight the key area, preferably with a one-line text description |
::: danger ⚠️ Important Prerequisite
**Not every AI supports image input.** Communicating via screenshots requires a model with multimodal capabilities (that is, the ability to understand and analyze images). Models that currently support image input include Claude (Anthropic), GPT-4V/GPT-4o (OpenAI), Gemini (Google), and some Chinese models such as Tongyi Qianwen and Wenxin Yiyan.
**If the AI you are using does not support image input**, screenshots will not be recognized. In this case, please switch to copy-pasting text for communication.
:::
## Prompt Tips to Make AI "Explain Well"
If you want to learn from the answer rather than just get it, instructions like the following can significantly improve the quality of the explanation:
> **Learning Question Examples**
>
> - "Please explain this concept clearly in 5 sentences first, and then ask me a few questions to verify if I understood it correctly."
> - "Please explain this error message in detail; I don't understand why the error occurred."
# I've persisted for a long time but still can't get it to work, and I want to give up
Maybe the way you are persisting is the problem. Don't struggle alone in the dark; come and talk to the authors and teaching assistants. Tell them frankly what you have tried, exactly where you are stuck, and how you are feeling. Often a slight change of direction or one missing piece of knowledge is enough to keep you moving forward.
# I feel some of the tutorial's design is unreasonable
You are welcome to contact the authors at any time, submit an issue, or give feedback directly in the group or class. We very much hope to work with you to keep improving these tutorials: wherever something is unclear, the experience is poor, or your effort is wasted, point it out frankly. The more concrete the feedback, the more it helps newcomers avoid pitfalls.
# Reference
- [Nanjing University Computer Science and Technology Department Computer System Fundamentals Course Experiment](https://nju-projectn.github.io/ics-pa-gitbook/ics2025/)
@@ -0,0 +1,743 @@
# Primary 1: In the AI Era, If You Can Speak, You Can Code
This is a **project-based learning** tutorial. We encourage you to follow the steps one by one and try to reproduce the results.
Don't worry about making mistakes or modifying the content. We always believe you can do it. Please always remember:
<div style="text-align: center;">
<div style="display: inline-block; padding: 8px 20px; border-radius: 8px; border: 1px dashed #FFB6C1; background: linear-gradient(135deg, #FFF0F5 0%, #FFE4EC 100%); margin: 12px 0;">
<span style="font-size: 15px; font-weight: 500; color: #666;">Completion is more important than perfection 🐣</span>
</div>
</div>
<script setup>
const duration = 'Approx. <strong>4 hours</strong>, can be completed in multiple sessions'
</script>
## Chapter Outline
<ChapterIntroduction :duration="duration" :tags="['Conversational AI Programming', 'AI-Native Mini-Games', 'Snake Game Practice']" coreOutput="AI-Native Snake + Custom Mini-Game" expectedOutput="1 playable AI-native Snake game + (Optional) 1 custom AI-native mini-game or Demo of your choice">
If you <strong>don't know how to program at all</strong>, or only know the basics, this chapter is for you. We will start from the very beginning: using <strong>conversations</strong> to have AI write code for you, without needing to memorize syntax or set up environments. It will run right in your browser.
You will personally create <strong>your first running program</strong>—a Snake game that can "eat words, write poems, and draw". Through this practical exercise, you will experience what AI programming is really like: AI is not replacing your thinking, but rather, you speak your ideas, and AI helps you implement them.
All creation starts by going from 0 to 1. We are glad to pass on our confidence and expertise to you. For you, <strong>execution is all you need</strong>.
</ChapterIntroduction>
<div style="margin: 50px 0;">
<ClientOnly>
<StepBar :active="0" :items="[
{ title: 'Dilemmas & Opportunities', description: 'New possibilities for coding' },
{ title: 'Capability Exploration', description: '60-second speed development' },
{ title: 'Native Practice', description: 'Build an AI-native Snake' },
{ title: 'Extended Creation', description: 'Create other games' }
]" />
</ClientOnly>
</div>
## 1. Dilemmas and Opportunities for Ordinary People
Many people have a bunch of product ideas in their heads: a small tool to help manage finances, a webpage to record a child's growth, or even a mini-game. But the thought of having to write code or find a programmer often discourages them directly.
After the emergence of AI, for the first time, ordinary people have a completely new possibility: you don't need to know how to write code, you just need to learn how to clearly tell AI what you want. [Data from GitHub Copilot](https://www.wearetenet.com/blog/github-copilot-usage-data-statistics) shows that over 15 million developers are using AI-assisted programming, with an average of 46% of code being AI-generated! In Java projects, this proportion can reach 61%.
<el-card shadow="hover" style="margin: 20px 0; border-radius: 12px;">
<template #header>
<div style="display: flex; align-items: center; gap: 8px;">
<span style="font-size: 20px;">🚀</span>
<span style="font-weight: bold; font-size: 16px;">Leaps in Efficiency and Adoption</span>
</div>
</template>
<el-row :gutter="20" style="margin-bottom: 24px;">
<el-col :span="6" :xs="12">
<div style="text-align: center; padding: 10px;">
<div style="color: #409EFF; font-size: 24px; font-weight: bold;">55%</div>
<div style="color: #909399; font-size: 12px; margin-top: 4px;">Speed Increase</div>
</div>
</el-col>
<el-col :span="6" :xs="12">
<div style="text-align: center; padding: 10px;">
<div style="color: #67C23A; font-size: 24px; font-weight: bold;">2.4 <span style="font-size: 14px;">Days</span></div>
<div style="color: #909399; font-size: 12px; margin-top: 4px;">Task Time (from 9.6)</div>
</div>
</el-col>
<el-col :span="6" :xs="12">
<div style="text-align: center; padding: 10px;">
<div style="color: #E6A23C; font-size: 24px; font-weight: bold;">81%</div>
<div style="color: #909399; font-size: 12px; margin-top: 4px;">Day-1 Install Rate</div>
</div>
</el-col>
<el-col :span="6" :xs="12">
<div style="text-align: center; padding: 10px;">
<div style="color: #F56C6C; font-size: 24px; font-weight: bold;">96%</div>
<div style="color: #909399; font-size: 12px; margin-top: 4px;">Suggestion Adoption</div>
</div>
</el-col>
</el-row>
<div style="line-height: 1.8; color: #606266;">
What is truly exciting is the leap in efficiency: developers' task completion speed increased by <b>55%</b>, and code that originally took 9.6 days to deliver can now be done in just <b>2.4 days</b>. This visible improvement shows that AI is no longer just an "optional feature" but is becoming an indispensable assistant in the development workflow. The adoption data confirms this: on the day access was granted, <b>81%</b> of developers installed it and started using it immediately; among them, <b>96%</b> started accepting the AI's code suggestions that same day. In other words, developers integrated AI into their daily coding routines almost instantly.
</div>
</el-card>
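
To see how the figures in the card relate to each other, here is a quick arithmetic sketch. It assumes that "55% speed increase" means 55% more work per unit of time, which the source does not spell out; the function names are made up for illustration.

```javascript
// Fraction of time saved when a task shrinks from `before` to `after`.
function timeSaved(before, after) {
  return (before - after) / before;
}

// Fraction of time saved if the same work is done `increase` faster
// (0.55 for "55% faster"), reading "faster" as higher throughput.
// This reading is an assumption, not something the source states.
function timeSavedFromSpeedup(increase) {
  return 1 - 1 / (1 + increase);
}

const cycle = timeSaved(9.6, 2.4);        // about 0.75, i.e. roughly 4x faster
const task = timeSavedFromSpeedup(0.55);  // about 0.35

console.log(`${(cycle * 100).toFixed(0)}% less time for 9.6 -> 2.4 days`);
console.log(`${(task * 100).toFixed(0)}% less time at 55% higher speed`);
```

Under that reading the two figures describe different things (roughly 35% versus 75% less time). Even if "55%" is read as 55% less time, it still does not match 9.6 to 2.4 days, so the two numbers most likely come from different measurements and should not be treated as one statistic.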
For ordinary people, this trend is even more significant: if professional programmers are relying heavily on AI to write code, **why can't those of us who don't know how to program communicate directly with AI to realize our ideas**?
The goal of this course is to help you practice a new skill: building apps through natural language conversations. We will teach you how to communicate precisely with AI and how to let AI turn the ideas in your head into real, usable products.
<div style="margin: 50px 0;">
<ClientOnly>
<StepBar :active="1" :items="[
{ title: 'Dilemmas & Opportunities', description: 'New possibilities' },
{ title: 'Capability Exploration', description: '60-second speed' },
{ title: 'Native Practice', description: 'Build AI-native Snake' },
{ title: 'Extended Creation', description: 'Create other games' }
]" />
</ClientOnly>
</div>
## 2. To What Extent Can AI Help You?
In this section, we only discuss one question: if you completely don't know how to write code, to what extent can today's AI help you?
Roughly speaking, current LLMs are competent at developing **simple internal tools**, **data visualization dashboards**, and some **lightweight mini-games**. That is generally enough for making **tools for personal use** or validating requirements from a **product manager's perspective**. One-click generation of a **commercially mature product** is not there yet: that still typically requires continuous manual polishing of **process design** and **details**.
Next, let's take Snake as an example and see exactly what AI programming can achieve.
### 2.1 Build a Snake Game in 60 Seconds
First, please open the experimental site used in the course, [z.ai](https://chat.z.ai/). `z.ai` is an AI platform developed by Zhipu AI (one of China's leading LLM companies), powered by their proprietary GLM models. This platform includes various features, such as slideshow generation, poster design, and full-stack development. In this tutorial, we will focus on its full-stack development module.
::: details 💡 What is the "programming right on the web" paradigm?
In the past, developing a web app required:
- Installing programming environments (Node.js, Python, etc.)
- Configuring code editors
- Learning HTML/CSS/JavaScript
- Dealing with dependencies and errors
Now, with AI coding platforms, you only need to:
- Open your browser and visit the site
- Describe your desired features in natural language
- Have AI instantly generate the code and let you preview the result live
This "conversation as programming" paradigm changes coding from "writing instructions" to "describing requirements". You don't need to care about low-level technical details; just clearly state what you want. This is the new programming paradigm of the AI era—**Vibe Coding**.
:::
![](../../../zh-cn/stage-0/0.2-ai-capabilities-through-games/images/index-2026-01-07-18-25-03.png)
Enter the simple requirement below and click the **Full-stack Development** button. You can watch the webpage being built in real time. It usually takes about as long as brewing a cup of coffee!
```
Help me create a Snake game:
1. Control snake movement with arrow keys
2. When it eats food, it gets longer and the score increases
3. Hitting walls or itself results in Game Over
4. Include Start and Restart buttons
5. The UI should be clean and elegant
```
![](../../../zh-cn/stage-0/0.2-ai-capabilities-through-games/images/index-2026-01-07-18-34-03.png)
Once generated, you will see a browsable webpage UI on the right. Scroll around or click the 🧭 button at the top to view it in full screen.
> The buttons at the top, from left to right: the arrow button expands the chat history, the pencil button starts a new chat, the refresh button rebuilds the page, the compass button toggles fullscreen, the download button downloads the project, the <> button shows the code, and the Publish button publishes the page.
![](../../../zh-cn/stage-0/0.2-ai-capabilities-through-games/images/index-2026-01-07-18-35-11.png)
If you'd like to check the webpage's source code, click the code icon in the top right to view the entire codebase.
![](../../../zh-cn/stage-0/0.2-ai-capabilities-through-games/images/image7.png)
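The rules from the prompt (arrow-key movement, growing and scoring on food, Game Over on hitting a wall or the snake itself) map to a surprisingly small amount of logic. The sketch below is a hypothetical illustration of what that core might look like, not the code z.ai actually generates; the grid size, the starting position, and the names `createGame`, `turn`, and `step` are all assumptions made for this example.

```javascript
// Hypothetical sketch of the Snake rules; all names and numbers are illustrative.
const GRID_SIZE = 20;

const DIRECTIONS = {
  ArrowUp: { x: 0, y: -1 },
  ArrowDown: { x: 0, y: 1 },
  ArrowLeft: { x: -1, y: 0 },
  ArrowRight: { x: 1, y: 0 },
};

function createGame() {
  return {
    snake: [{ x: 5, y: 5 }], // the head is the first element
    direction: DIRECTIONS.ArrowRight,
    food: { x: 8, y: 5 },
    score: 0,
    gameOver: false,
  };
}

// Pick a random cell that the snake does not occupy.
function randomFreeCell(snake) {
  while (true) {
    const cell = {
      x: Math.floor(Math.random() * GRID_SIZE),
      y: Math.floor(Math.random() * GRID_SIZE),
    };
    if (!snake.some((part) => part.x === cell.x && part.y === cell.y)) return cell;
  }
}

// Rule 1: arrow keys change direction (a direct reversal is ignored).
function turn(state, key) {
  const dir = DIRECTIONS[key];
  if (!dir) return state;
  const reversing = dir.x === -state.direction.x && dir.y === -state.direction.y;
  return reversing ? state : { ...state, direction: dir };
}

// Advance the game by one tick and return the new state.
function step(state, pickFood = randomFreeCell) {
  if (state.gameOver) return state;

  const head = state.snake[0];
  const next = { x: head.x + state.direction.x, y: head.y + state.direction.y };

  // Rule 3: hitting a wall or the snake's own body ends the game.
  // Simplification: the tail cell still counts as occupied on this tick.
  const hitWall = next.x < 0 || next.y < 0 || next.x >= GRID_SIZE || next.y >= GRID_SIZE;
  const hitSelf = state.snake.some((part) => part.x === next.x && part.y === next.y);
  if (hitWall || hitSelf) return { ...state, gameOver: true };

  // Rule 2: eating food keeps the tail (the snake grows) and adds a point.
  const ateFood = next.x === state.food.x && next.y === state.food.y;
  const body = ateFood ? state.snake : state.snake.slice(0, -1);
  const snake = [next, ...body];

  return {
    ...state,
    snake,
    score: ateFood ? state.score + 1 : state.score,
    food: ateFood ? pickFood(snake) : state.food,
  };
}
```

In a real page, `turn` would typically be called from a `keydown` listener and `step` from a timer such as `setInterval`, followed by a redraw. You don't need to write any of this yourself, but recognizing these pieces makes it easier to tell the AI which rule is misbehaving.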
::: tip 🌐 Explore More AI Programming Tools
Besides z.ai, we also recommend trying out these excellent AI programming platforms:
| Tool | Link | Features |
|------|------|----------|
| **Google AI Studio** (Recommended)| [aistudio.google.com/apps](https://aistudio.google.com/apps) | Official tool from Google, powered by Gemini, great for rapid prototyping |
| **Figma Make** | [figma.com/make](https://www.figma.com/make) | Deeply integrated with design tools, ideal for interactive prototypes |
| **Coze** | [coze.cn](https://www.coze.cn) | AI bot platform by ByteDance, zero-code visual building |
| **v0.dev** | [v0.dev](https://v0.dev) | AI generation for React components from Vercel |
| **Bolt.new** | [bolt.new](https://bolt.new) | AI full-stack development capable of generating deployed apps |
| **Lovable** | [lovable.dev](https://lovable.dev) | High-quality React app generation |
| **Replit Agent**| [replit.com](https://replit.com) | Online IDE integrated with AI |
For more comparisons, view the appendix: [Comparison of 7 AI Programming Tools](../../stage-1/appendix-articles/example0-1/vibe-coding-tools-snake-game-tutorial.md)
:::
### 2.2 What Conversational Programming Can and Cannot Do
This section focuses on a specific question: When relying exclusively on conversational AI and writing no code at all, how far can you push a project?
From experience, a fairly consistent conclusion is that AI can help you finish something "small but complete", but deciding "how much is enough" at each step is still up to you.
#### Excels at "Small and Clear" Apps
From the Snake game example, you already saw a typical pattern:
As long as you can clearly describe the UI and interaction, AI can often piece together a fully functional, clickable webpage in just a few rounds of conversation.
Such tasks often share a few characteristics:
- Clear scope: one page, a simple internal tool, a small game mechanic.
- Visible results: you immediately see if it works as expected.
- Direct debugging: you can point out errors and ask for corrections easily.
Within these boundaries, you can view the AI as a highly capable "junior assistant".
**AI's success rate in handling small-scale tasks:**
<el-progress :percentage="90" :stroke-width="15" status="success" striped striped-flow />
#### Large Projects Require a "Process Perspective"
Once a project goes beyond this small, clear scope, relying purely on conversational requests to build a complex system end-to-end quickly hits a ceiling. Large projects involve backend databases, third-party services, authentication, permissions, edge cases, state management, and more.
In these situations, a sensible approach is to draw up a clear process flowchart and break the work into segments that are handled one at a time.
#### The Difference Between Generating and Validating
Just because AI wrote the code doesn't mean it's ready for a commercial launch! Always validate AI-generated code, especially in security-sensitive systems.
::: warning ⚠️ Usage Guidelines
- **Prototypes/Tools/Demos**: Highly suitable for early-stage builds and iterations.
- **Large consumer-facing products**: Usually need developers to own the architecture.
- **High-security systems**: Not suitable for immediate deployment; they need stringent checks first.
:::
<div style="margin: 50px 0;">
<ClientOnly>
<StepBar :active="2" :items="[
{ title: 'Dilemmas & Opportunities', description: 'New possibilities' },
{ title: 'Capability Exploration', description: '60-second speed' },
{ title: 'Native Practice', description: 'Build AI-native Snake' },
{ title: 'Extended Creation', description: 'Create other games' }
]" />
</ClientOnly>
</div>
## 3. Hands-on: Your First AI Native Application
Let's get hands-on: we'll build AI capabilities directly into our game.
### 3.1 AI-Native Snake
You can simply provide these prompts:
> **💡 Example Prompt:** Build me a Snake game.
>
> ![](../../../zh-cn/stage-0/0.2-ai-capabilities-through-games/images/image12.png)
> **💡 Example Prompt:** Build me a Snake game that supports:
> 1. Eating different words and placing them in a collection box.
>
> ![](../../../zh-cn/stage-0/0.2-ai-capabilities-through-games/images/image13.png)
> **💡 Example Prompt:** Build a Snake game that supports:
> 1. The snake eats different words, which are collected in a box.
> 2. After 8 words are eaten, the LLM generates a poem using them.
> 3. An image generation API is called right after the poem is composed.
>
> ![](../../../zh-cn/stage-0/0.2-ai-capabilities-through-games/images/image14.png)
If you run into any issues, just send a screenshot of the error or tell the AI what's wrong, and it will iterate on a fix.
![](../../../zh-cn/stage-0/0.2-ai-capabilities-through-games/images/image15.png)
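What makes this version "AI-native" is the flow behind the third prompt: collect words, generate a poem once 8 words are in the box, then generate an image from that poem. A minimal sketch of that flow follows. `generatePoem` and `generateImage` are hypothetical placeholders for whichever LLM and image APIs the generated project ends up calling; they are passed in as parameters precisely because this example does not assume any real service.

```javascript
// Hypothetical sketch: generatePoem and generateImage stand in for real
// LLM and image-generation API calls, which this example does not assume.
const WORDS_PER_POEM = 8;

function createWordBox({ generatePoem, generateImage }) {
  const words = [];   // the "collection box" shown in the UI
  const results = []; // every poem + image produced so far

  // Call this each time the snake eats a word.
  async function collect(word) {
    words.push(word);
    if (words.length < WORDS_PER_POEM) return null; // keep collecting

    const batch = words.splice(0, WORDS_PER_POEM);  // empty the box
    const poem = await generatePoem(batch);         // step 2: LLM writes the poem
    const imageUrl = await generateImage(poem);     // step 3: image follows the poem
    const result = { words: batch, poem, imageUrl };
    results.push(result);
    return result;
  }

  return { collect, words, results };
}
```

Because the two AI calls are injected, you can test the game with fake functions first and swap in real APIs later. This is also a useful way to phrase a request to the AI: "keep the game logic separate from the API calls."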
### 3.2 Add New Features to the Game
After completing the basic functionality, we can try adding some new twists to our program! If you find the process of the snake eating words or characters a bit boring, you can have the snake eat words of different colors and change the snake's color accordingly.
You can also add special effects to the "eating" process, or introduce magic words that trigger special effects—like increasing the snake's speed or size. Another idea is to have the model generate a poem and an image every time the snake eats a word, instead of waiting until it eats eight.
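To give a feel for how small these twists are in code, here is a hypothetical sketch of "colored words" and "magic words". The word list, the effect sizes, and the shape of the `snake` object are all invented for illustration; your generated project will likely look different.

```javascript
// Hypothetical effects table: the words and numbers are illustrative only.
const MAGIC_WORDS = new Map([
  ["wind", (snake) => ({ ...snake, speed: snake.speed * 1.5 })],
  ["giant", (snake) => ({ ...snake, segmentSize: snake.segmentSize * 2 })],
]);

// Return the snake's new state after it eats a word.
function eatWord(snake, food) {
  // Every word recolors the snake to match the word's color.
  let next = {
    ...snake,
    color: food.color,
    eaten: [...snake.eaten, food.text],
  };

  // Magic words additionally trigger a special effect.
  const effect = MAGIC_WORDS.get(food.text);
  if (effect) next = effect(next);
  return next;
}
```

Adding a new magic word is then a one-line change to the table, which is exactly the kind of request ("add a magic word 'night' that slows the snake down") that AI handles well.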
If these feel challenging, you can ask the language model directly for help! It can provide creative suggestions to make your game more fun. Give it a try!
```
1. "Word Unlocks World" Mechanic
Every time the snake eats a word, the LLM performs a poetic association on that word (e.g., "tree" → "forest", "shade"), and the image model instantly generates a small artwork for that word. These images gradually piece together into a unique, player-created panorama, so players are "painting and writing poetry" with every playthrough.
2. "Poetry Puzzle" Gameplay
Each word the snake eats triggers the LLM to generate a short verse, and the image model generates an illustration. These verses and images combine like puzzle pieces, forming an AI-collaborative poem and painting at the end of the round.
3. "Magic Words" & "Story Branches"
Special "magic words" (e.g., "wind", "night", "dream") not only trigger the LLM to generate poetry but also change the mood or theme of the scene—transforming the generated image style to nighttime, stormy, or dreamlike atmospheres.
Branching story: The LLM gives a theme or riddle at the start (e.g., "autumn memories"). The player's word choices directly influence the story and poetry evolution, with the image model updating backgrounds and visuals in real time.
4. "Real-time Interactive Generation"
After each word, the LLM generates a line of dialogue or description; NPCs in the game can "speak" to the player, or the environment can change accordingly.
The snake's appearance or obstacles in the game can visually change based on the words eaten, thanks to the image model.
5. "Create & Share"
Players can save and share their AI-created poems and images at the end of a session, showing off their unique "AI collaboration."
Leaderboards for "Most Beautiful Poem + Art", "Most Creative Word Combination", etc., encourage replaying and creativity.
6. "Sentence Snake" Challenge
Reverse mode: The LLM gives a line of poetry or a riddle, and the player must guide the snake to eat words in order to reconstruct the sentence. Eating the wrong word triggers funny or artistic consequences via the image generation model.
7. "Themed Levels" & "Style Selection"
At the start of the game, the player chooses a theme (e.g., "fairy tale", "sci-fi", "Tang poetry"), and both the LLM and image model adjust word selection, poetry style, and visuals to match, making each run feel fresh.
8. "Live Co-creation"
When a special word is eaten, the LLM can prompt the player to input a phrase or choose a style, then AI generates corresponding verses and illustrations, making it a true human-AI co-creation.
9. "AI Easter Eggs & Achievements"
Certain word combinations are recognized by the LLM as special themes or inside jokes (e.g., "moon", "osmanthus", "riverbank"), triggering rare verses and illustrations that reward exploration.
10. "A Growing Story"
As the snake grows, the LLM generates a continuous story-poem, and the image model creates a seamless scroll or panorama, so the player is simultaneously "writing, painting, and playing."
```
We can also ask the LLM to write project-level prompts for us. In the previous section, we wrote the Snake game prompt ourselves. Now let's try having the LLM generate a prompt that lays out an overall framework and implementation path (you can generate it directly with z.ai).
If you want to learn how to write better prompts, check out the [Prompt Engineering Appendix](/zh-cn/appendix/8-artificial-intelligence/prompt-engineering).
> I want AI to generate a web-based Snake game and need a more complete prompt to make the result more impressive and fun. Please generate the corresponding prompt. The current goal is: generate a Snake game that implements the function of eating different words to generate poetry, and should include an image generation module.
z.ai's response will look like this:
![](../../../zh-cn/stage-0/0.2-ai-capabilities-through-games/images/image56.png)
We can use this prompt to regenerate the project in full-stack development mode:
![](../../../zh-cn/stage-0/0.2-ai-capabilities-through-games/images/image57.png)
![](../../../zh-cn/stage-0/0.2-ai-capabilities-through-games/images/image58.png)
<div style="margin: 50px 0;">
<ClientOnly>
<StepBar :active="3" :items="[
{ title: 'Dilemmas & Opportunities', description: 'New possibilities' },
{ title: 'Capability Exploration', description: '60-second speed' },
{ title: 'Native Practice', description: 'Build AI-native Snake' },
{ title: 'Extended Creation', description: 'Create other games' }
]" />
</ClientOnly>
</div>
### 3.3 Try Making Other Mini-Games
Beyond Snake, we can let our imagination run wild.
Create anything you like, and don't be afraid to break everything and start over!
```
1. AI Art Gallery Platform
Description: An online gallery showcasing AI-generated artworks where users can upload, share, and comment on AI art.
Features: User account system, artwork upload and display, rating system, category browsing, AI generation tool integration.
Tech highlights: React/Vue frontend, Node.js backend, MongoDB database, AI API integration.
2. Retro Game Archive
Description: A website paying tribute to classic games, featuring game history, gameplay guides, and playable retro games online.
Features: Game database, timeline display, online emulator, user reviews, game collection feature.
Tech highlights: Responsive design, WebGL/Canvas game implementation, RESTful API, user authentication.
3. Sustainable Living Tracker
Description: A website helping users track and reduce their carbon footprint through eco-tips and community challenges.
Features: Personal carbon footprint calculator, goal setting, progress tracking, community challenges, eco knowledge base.
Tech highlights: Data visualization, mobile optimization, social features, push notifications.
4. Virtual Kitchen Assistant
Description: An AI-based cooking guidance platform providing personalized recipe recommendations and step-by-step cooking instructions.
Features: Recipe database, ingredient recognition, personalized recommendations, cooking timer, nutrition analysis.
Tech highlights: Image recognition API, ML recommendation system, voice control, real-time video guidance.
5. Underground Music Discovery Platform
Description: A music streaming platform focused on indie and emerging artists, offering a unique discovery experience.
Features: Music streaming, artist profiles, personalized recommendations, playlist creation, community reviews.
Tech highlights: Audio streaming, recommendation algorithms, social features, music visualization.
6. Minimalist Task Management System
Description: A task management tool with zen aesthetics, focused on simple and efficient task organization.
Features: Task creation and categorization, priority setting, progress tracking, team collaboration, data analytics.
Tech highlights: Minimalist UI design, drag-and-drop, real-time sync, cross-platform compatibility.
7. Sci-Fi Writing Workshop
Description: A platform providing creative tools and inspiration for sci-fi writers, including world-building aids and character development tools.
Features: Story structure tools, character profiles, world-building templates, writing statistics, community feedback.
Tech highlights: Rich text editor, data visualization, collaborative editing, AI-assisted creation.
8. Personal Knowledge Graph
Description: A tool helping users build personal knowledge networks, visualizing and connecting various ideas and information.
Features: Node creation and connection, tagging system, search functionality, import/export tools, visual charts.
Tech highlights: Graph database, data visualization algorithms, Markdown support, cross-device sync.
9. Virtual Botanical Garden
Description: An interactive plant encyclopedia where users can explore the plant world and create virtual gardens.
Features: Plant database, 3D plant models, growth simulation, gardening guides, community showcase.
Tech highlights: 3D rendering, seasonal change simulation, AR integration, plant recognition API.
10. Programming Challenge Arena
Description: An online competition platform for programmers with coding challenges of various difficulty levels.
Features: Challenge problems, code editor, auto-evaluation, leaderboards, learning paths.
Tech highlights: Code sandbox environment, real-time evaluation system, algorithm visualization, social learning features.
```
And... if you enjoy playing games, let's try creating games together!
```
1. 3D Open World RPG
Description: A fantasy RPG with a vast open world, quests, and character progression.
Features: Day-night cycle, dynamic weather, skill trees, multiplayer co-op, crafting system.
Tech highlights: Three.js or Babylon.js for 3D rendering, server-side game logic, character customization, save system.
2. First-Person Shooter (FPS) Arena
Description: A fast-paced multiplayer FPS with various game modes and maps.
Features: Team deathmatch, capture the flag, weapon customization, ranked matches.
Tech highlights: WebGL/Three.js for 3D graphics, multiplayer netcode, hit detection, voice chat.
3. AI Chess and Multiplayer
Description: A full-featured chess platform with AI opponents and online matches.
Features: AI difficulty levels, endgame challenges, tournament mode, replay analysis.
Tech highlights: Chess logic library, WebSocket for real-time matches, ELO ranking system, anti-cheat.
4. Mahjong Online Multiplayer
Description: A traditional Mahjong game with online multiplayer and scoring.
Features: Multiple rule sets, private rooms, ranking system, replay feature.
Tech highlights: Tile matching logic, real-time multiplayer, lobby system, score tracking.
5. Turn-Based Strategy Game
Description: A tactical strategy game with grid-based combat and unit management.
Features: Campaign mode, skirmish, unit upgrades, fog of war, multiplayer battles.
Tech highlights: Grid movement system, AI decision-making, turn synchronization, save/load system.
6. Time Trial Racing Game
Description: A 3D racing game focused on time trials and track records.
Features: Multiple tracks, car customization, ghost replays, leaderboards.
Tech highlights: 3D car physics, track editor, replay system, online leaderboards.
7. Card Battle Game (Deck Building)
Description: A strategic card game where players build decks and battle opponents.
Features: Card collection, deck building, ranked matches, seasonal events.
Tech highlights: Card game logic, matchmaking system, AI opponents, card animations.
8. Battle Royale (Top-Down 2D)
Description: A top-down 2D battle royale with shrinking play zones and loot mechanics.
Features: Solo and squad modes, weapon variety, in-match events, leaderboards.
Tech highlights: Real-time multiplayer, zone shrinking logic, loot generation system, matchmaking.
9. Horror Survival Game (First-Person)
Description: A first-person horror game with resource management and escape mechanics.
Features: Atmospheric environments, puzzles, enemy AI, multiple endings.
Tech highlights: Dynamic lighting, sound design, enemy pathfinding, save system.
10. Music Rhythm Game (3D)
Description: A 3D rhythm game where players hit notes to the beat of the music.
Features: Multiple difficulty levels, track editor, custom song support, leaderboards.
Tech highlights: Audio analysis, beat synchronization, 3D note tracks, input timing detection.
```
## 📚 Assignment
<el-card id="assignment-card" shadow="hover" style="margin: 20px 0; border-radius: 12px;">
<template #header>
<div style="font-weight: bold; font-size: 16px;">🎯 Chapter Assignment: Build Your First AI-Native Mini-Game</div>
</template>
<p>
In this section, you've followed the steps to experience the complete process from "conversational Snake generation" to "understanding AI-native game design thinking." The following assignments will help you turn this understanding into real skills.
</p>
<ol>
<li>
<strong>Fully Reproduce the AI-Native Snake Game</strong>
<ul>
<li>At minimum, implement: the snake can move, eating "food" changes its length and score, and hitting walls or itself ends the game.</li>
<li>During reproduction, practice sending the error description + error message + key code snippets all at once to the AI, asking it to fix things in "beginner mode."</li>
</ul>
</li>
<li>
<strong>(Optional) Create 1 Original AI-Native Mini-Game or Demo</strong>
<ul>
<li>It can be any lightweight gameplay involving text, images, music, rhythm, etc., such as "eat words to write poems," "rhythm clicking," "generative runner," etc.</li>
<li>The focus isn't on flashy graphics, but on being able to clearly articulate: what specifically did AI help with here, and what "hard-to-do-manually or tedious" part did it solve.</li>
</ul>
</li>
</ol>
<p>
That's the complete tutorial! You may need about <strong>4 hours</strong> to finish all the content and build your own Snake game. Don't rush—explore, experiment, and enjoy the process. If you encounter concepts you don't quite understand along the way, we recommend checking the relevant sections in the appendix below.
</p>
</el-card>
## Appendix
<el-card id="appendix-nav" shadow="hover" style="margin-top: 24px; margin-bottom: 24px; border-left: 5px solid #67C23A;">
<div style="font-weight: bold; margin-bottom: 8px;">Appendix Navigation</div>
<div style="color: #606266; font-size: 14px; line-height: 1.6; margin-bottom: 12px;">
Here we've compiled some foundational concepts related to this chapter: if you encounter questions like "what is frontend?" or "what exactly does Vibe Coding mean?" during your learning, you can always come back here to look them up.
</div>
<el-row :gutter="16">
<el-col :span="12">
<a href="#appendix-1" style="text-decoration: none; color: inherit;"><b>Appendix 1: Do We Need Frontend Knowledge?</b></a><br/>
<span style="font-size: 12px; color: #909399">Understand where frontend fits in the overall application, and know which parts are "visible."</span>
</el-col>
<el-col :span="12">
<a href="#appendix-2" style="text-decoration: none; color: inherit;"><b>Appendix 2: What Exactly is Vibe Coding</b></a><br/>
<span style="font-size: 12px; color: #909399">Understand the core idea of "conversational development" and how to collaborate with AI.</span>
</el-col>
</el-row>
<el-row :gutter="16" style="margin-top: 10px;">
<el-col :span="12">
<a href="#appendix-3" style="text-decoration: none; color: inherit;"><b>Appendix 3: Model Context</b></a><br/>
<span style="font-size: 12px; color: #909399">Understand commonly heard but easily confused concepts like "context length."</span>
</el-col>
<el-col :span="12">
<a href="#appendix-4" style="text-decoration: none; color: inherit;"><b>Appendix 4: Instruction Following</b></a><br/>
<span style="font-size: 12px; color: #909399">Learn why models sometimes "don't understand" and how to write clearer instructions.</span>
</el-col>
</el-row>
<div style="margin-top: 12px; font-size: 12px; color: #909399;">
Tip: You can press Ctrl/⌘+F to search for keywords, or copy confusing paragraphs to AI and ask it to explain again in a way "a complete beginner can understand."
</div>
</el-card>
## <span id="appendix-1">[Appendix 1: Do We Need Frontend Knowledge?](#appendix-nav)</span>
::: tip 💡 One-line Summary
You don't need to write code, but understanding the basic concepts helps you describe requirements to AI more effectively.
:::
<el-row :gutter="16" style="margin: 20px 0;">
<el-col :span="12" :xs="24" style="margin-bottom: 16px;">
<el-card shadow="hover" style="border-radius: 12px; height: 100%;">
<template #header>
<div style="display: flex; align-items: center; gap: 8px;">
<span style="font-size: 20px;">👁️</span>
<span style="font-weight: bold;">Frontend</span>
<el-tag type="success" size="small">Visible</el-tag>
</div>
</template>
<div style="color: #606266; line-height: 1.8;">
Everything users can <strong>see and click</strong>
<ul style="margin: 12px 0; padding-left: 20px;">
<li>Page titles, text, images</li>
<li>Buttons, input fields, dropdown menus</li>
<li>Game interfaces, animation effects</li>
</ul>
</div>
</el-card>
</el-col>
<el-col :span="12" :xs="24" style="margin-bottom: 16px;">
<el-card shadow="hover" style="border-radius: 12px; height: 100%;">
<template #header>
<div style="display: flex; align-items: center; gap: 8px;">
<span style="font-size: 20px;">⚙️</span>
<span style="font-weight: bold;">Backend</span>
<el-tag type="info" size="small">Invisible</el-tag>
</div>
</template>
<div style="color: #606266; line-height: 1.8;">
Data processing running on the server
<ul style="margin: 12px 0; padding-left: 20px;">
<li>User score storage</li>
<li>Login account verification</li>
<li>Level content distribution</li>
</ul>
</div>
</el-card>
</el-col>
</el-row>
### The Frontend Trio
Browsers use three types of "code" to build pages:
<el-tabs type="border-card" style="margin: 20px 0;">
<el-tab-pane label="🏗️ HTML - Skeleton">
<div style="padding: 10px;">
<p><strong>Purpose:</strong> Defines <strong>what elements</strong> are on the page</p>
<p><strong>Analogy:</strong> The structural blueprint of a house (where walls, doors, and windows go)</p>
<el-card style="background: #f5f7fa; margin-top: 12px;">
<pre style="margin: 0;"><code>&lt;button&gt;Click me&lt;/button&gt;
&lt;h1&gt;Title&lt;/h1&gt;
&lt;img src="photo.png"&gt;</code></pre>
</el-card>
</div>
</el-tab-pane>
<el-tab-pane label="🎨 CSS - Style">
<div style="padding: 10px;">
<p><strong>Purpose:</strong> Controls <strong>how elements look</strong></p>
<p><strong>Analogy:</strong> The interior decoration of a house (colors, materials, layout)</p>
<el-card style="background: #f5f7fa; margin-top: 12px;">
<pre style="margin: 0;"><code>button {
background: blue;
color: white;
border-radius: 8px;
}</code></pre>
</el-card>
</div>
</el-tab-pane>
<el-tab-pane label="⚡ JavaScript - Behavior">
<div style="padding: 10px;">
<p><strong>Purpose:</strong> Makes the page <strong>interactive</strong></p>
<p><strong>Analogy:</strong> The electrical switches of a house (responses after clicking)</p>
<el-card style="background: #f5f7fa; margin-top: 12px;">
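<pre style="margin: 0;"><code>// Find the button on the page, then react to clicks
const button = document.querySelector('button')
button.onclick = () => {
alert('You clicked me!')
}</code></pre>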
</el-card>
</div>
</el-tab-pane>
</el-tabs>
### How Does Code Become a Page?
When you open a webpage, the browser processes three types of code in order:
**1. HTML — Defines the page structure**
The browser first parses HTML to understand what elements are on the page (headings, paragraphs, images, buttons, etc.) and their hierarchical relationships.
**2. CSS — Applies styles**
Then the browser applies CSS rules to add styles to these elements: colors, sizes, positions, spacing, etc., making the page look beautiful.
**3. JavaScript — Adds interactivity**
Finally, JavaScript code is executed to make the page "come alive": responding to clicks, submitting forms, playing animations, etc.
**4. Page rendering**
The combined result of all three is the webpage you ultimately see.
### Modern Frontend Frameworks: From HTML to React/Vue
The HTML, CSS, and JavaScript introduced above are the "three essentials" of frontend development—they are the foundation of all webpages. But when pages become complex, developing directly with these three can be challenging: code becomes hard to maintain, there's lots of repetitive work, and data synchronization is troublesome.
**Modern frontend frameworks** (like React, Vue, Angular) are built on top of HTML/CSS/JS to make development more efficient:
**1. HTML/CSS/JS (Basic stage)**
Directly manipulating page elements, suitable for simple pages. But as code grows, all logic gets mixed together and becomes hard to maintain.
**2. jQuery (Transitional stage)**
Simplified DOM operations, making code more concise. But you still need to manually manage page state and find corresponding elements to update when data changes.
**3. React/Vue (Modern stage)**
Adopts component-based and state-driven design:
- **Component-based**: Break the page into independent, reusable modules (like buttons, cards, navigation bars)
- **State-driven**: When data changes, the framework automatically updates the corresponding UI without manual manipulation
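The "component-based" and "state-driven" ideas can be sketched in a few lines of plain JavaScript. This is a toy illustration only and not how React or Vue are implemented internally; `createStore` and `ScoreCard` are hypothetical names, and the "UI" here is just an HTML string.

```javascript
// Toy illustration of state-driven UI; not how React or Vue work internally.
function createStore(initialState, render) {
  let state = initialState;
  let view = render(state);

  return {
    getView: () => view,
    // You only change the data; the view is recomputed from it.
    setState(patch) {
      state = { ...state, ...patch };
      view = render(state);
    },
  };
}

// A "component": a reusable function from data to markup.
function ScoreCard({ player, score }) {
  return `<div class="card">${player}: ${score}</div>`;
}

const store = createStore({ player: 'Ada', score: 0 }, ScoreCard);
store.setState({ score: 10 });
console.log(store.getView()); // <div class="card">Ada: 10</div>
```

Notice that no code ever searches for the score element to update it by hand, which is the manual step that jQuery-era code needed. Real frameworks add efficient updates on top of this idea, but the mental model is the same.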
::: tip 💡 Simple Understanding
- **HTML/CSS/JS** = Basic materials (bricks, cement, steel)
- **React/Vue** = Building framework (provides standards and tools for constructing buildings)
In the AI-assisted programming era, you don't need to deeply master every detail of frameworks. You just need to understand their basic concepts, and you can describe requirements in natural language to have AI generate code for you.
:::
### In Vibe Coding
**Core point: You don't need to write code, you just need to know how to describe.**
After understanding frontend concepts, you can describe requirements to AI like this:
> "Use React to make a leaderboard page, with a score list on the right side. Clicking a row shows player details below. The style should be clean and modern."
If you want to dive deeper into frontend fundamentals like HTML, CSS, and JavaScript, check out the [Web Basics Appendix](/zh-cn/appendix/3-browser-and-frontend/javascript-deep-dive). To learn about the evolution of frontend technology, check out the [Frontend Evolution Appendix](/zh-cn/appendix/3-browser-and-frontend/frontend-frameworks).
## <span id="appendix-2">[Appendix 2: What Exactly is Vibe Coding](#appendix-nav)</span>
> 💡 What is Vibe Coding? Computer scientist [Andrej Karpathy](https://karpathy.ai/) (one of the co-founders of OpenAI, former head of AI at Tesla) coined the term **vibe coding** in February 2025. This concept refers to a coding methodology that relies on LLMs, **allowing programmers to generate working code by providing natural language descriptions instead of manually writing code.**
![1767350588191](../../../zh-cn/stage-0/0.2-ai-capabilities-through-games/images/1767350588191.png)
Put simply, Vibe Coding is a way of "developing by talking." The core change is that you no longer need to write code line by line, look up syntax, or debug by hand. Instead, you describe what you want in natural language, for example:
"I need a login page with a phone number input field and a verification code input field."
"After successful login, redirect to the homepage and display the username in the top right corner."
"Give me a simple Snake game that can be controlled with keyboard arrow keys."
The Large Language Model (LLM) will automatically translate these descriptions into real, runnable code and generate the corresponding pages, logic, and data structures. After you see the results, you can propose modifications in natural language, such as "make the button bigger," "change the background to dark," "record scores and display a leaderboard," and the AI will continue adjusting the implementation according to your requirements.
In this mode, you don't need to learn a programming language first before writing code. Instead, you focus your main energy on: clearly stating what you want to do, judging "what's wrong" after seeing the results, and then proposing new modifications. AI handles turning these high-level ideas into concrete implementations, significantly reducing mechanical, repetitive coding work.
You can click here to learn more about vibe coding: [https://www.ibm.com/think/topics/vibe-coding](https://www.ibm.com/think/topics/vibe-coding)
You can click here to see more of Karpathy's shared content: [https://karpathy.bearblog.dev/blog/](https://karpathy.bearblog.dev/blog/)
### How to Pretend You're a Vibe Coding Master
In real vibe coding, we usually don't need many complex prompts. We may need one specific, moderately complex prompt for the whole program at the beginning, but after that, each step may only need prompts like these:
```
"There's a bug in the code, please fix it."
"I don't want partial code, give me the complete modified code."
"Your code still has problems."
"Please modify again and give me the complete corrected code."
"It was working before, why isn't it working now?"
"Did you not understand what I meant? Don't change my original code."
"Don't add any debugging features."
"Don't do things I didn't ask you to do."
"Where is the feature I asked you to implement?"
"Can you not understand what I'm saying?"
"I only want one function."
"I told you to refer to my previous code."
"Please don't add unnecessary comments."
"Please don't modify the basic logic of my original code."
"Help me modify the code."
"Modify based on my code..."
"Don't change my variable names!!!"
"Don't change the original function names!"
"Don't mess with my variables."
"Don't add extra features."
"Don't just generate a skeleton, generate the complete code."
```
This may sound exaggerated, but these are prompts we really do use in daily work. Because of the **context length limitations** of large language models, or because their **instruction following ability** isn't always strong, models may forget content discussed earlier in the conversation. For vibe coding, we therefore prefer models with long context and strong instruction following ability, and leaderboards or benchmarks covering these two aspects are a useful way to compare models.
A model's habits also reflect the style of its training data. For example, some models respond very formally, some add lots of embellishment, and some like to add many comments or unnecessary modules to code.
## <span id="appendix-3">[Appendix 3: Model Context](#appendix-nav)</span>
Model context can be understood as AI's short-term memory. It refers to all the text content that the model can "see" and "remember" during a single conversation or task, including your previous questions, system-provided instructions, relevant materials, etc.
It is precisely because of context that AI can understand you're continuing from previous content, enabling round after round of coherent, natural conversation. Without context, every sentence you say would appear to the model as a completely new question—it wouldn't know what you said before, and there would be no way to continue a conversation.
Each model has its own effective context length (context window). This length is usually measured in tokens (which can be roughly understood as units of "word fragments"), and most mainstream models currently range from 32k to 128k tokens. The longer the context, the more content the model can "read" at once, for example:
- Reading an entire lengthy paper or report in one go
- Referencing multiple materials and cases in the same conversation
- Having the model remember conclusions from complex discussions several rounds ago
When your input approaches or exceeds the model's context limit, some common phenomena often appear:
- The model starts forgetting details or key information from earlier in the long text
- As the conversation progresses, the topic gradually drifts from the original goal
- Across different Q&As about the same material, the referenced content becomes inconsistent
These phenomena don't mean the model suddenly "got dumber"—they are natural results of the context capacity being used up or nearly used up.
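To see why early details get lost, here is a minimal Python sketch of one common way a chat application might fit a conversation into a fixed token budget: keep the newest messages and drop the oldest. The `estimate_tokens` helper, the "about 4 characters per token" rule of thumb, and the budget value are all assumptions for illustration; real models use their own tokenizers and real applications may trim differently.

```python
def estimate_tokens(text: str) -> int:
    """Very rough estimate: about 4 characters per token (an assumption)."""
    return max(1, len(text) // 4)


def trim_history(messages: list[str], budget: int) -> list[str]:
    """Keep the most recent messages whose estimated tokens fit in the budget."""
    kept: list[str] = []
    used = 0
    for message in reversed(messages):  # walk from newest to oldest
        cost = estimate_tokens(message)
        if used + cost > budget:
            break  # this message and everything older is dropped
        kept.append(message)
        used += cost
    kept.reverse()  # restore chronological order
    return kept


history = [
    "Build a Snake game with a score counter.",  # oldest: the original goal
    "Make the background dark.",
    "Add a leaderboard that records scores.",    # newest
]

# With a small budget, the oldest message no longer fits.
print(trim_history(history, budget=16))
```

In this sketch the original goal is the first thing to fall out of the window, which matches the "topic drift" symptom described above.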
In practical use, we want the context to be as long as possible, while also being aware that:
- The longer the context, the more computing resources it consumes
- The corresponding API costs (fees) also increase accordingly
Therefore, when designing AI applications, you need to balance letting the model see enough information with controlling costs and improving efficiency. For example:
- Distill information that truly needs long-term retention before feeding it to the model
- Avoid stuffing detail information that's no longer needed into the context repeatedly
- Use external knowledge bases and similar approaches to hand "long-term memory" to the system rather than forcing it into the model's context
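The cost side of this trade-off can be estimated with simple arithmetic. The sketch below assumes a pricing scheme where input and output tokens are billed separately per million tokens; the prices used here are made-up placeholders, not any provider's real rates.

```python
def estimate_cost(input_tokens: int, output_tokens: int,
                  input_price_per_million: float,
                  output_price_per_million: float) -> float:
    """Estimate the cost of one request, in the same currency as the prices."""
    total = (input_tokens * input_price_per_million
             + output_tokens * output_price_per_million)
    return total / 1_000_000


# Hypothetical prices per million tokens, for illustration only.
PRICE_IN = 1.0
PRICE_OUT = 4.0

short_context = estimate_cost(2_000, 500, PRICE_IN, PRICE_OUT)
long_context = estimate_cost(100_000, 500, PRICE_IN, PRICE_OUT)
print(f"short context: {short_context:.4f}")
print(f"long context:  {long_context:.4f}")
```

The answer length is the same in both calls; only the amount of context sent along changes, which is why trimming unneeded details before each request pays off.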
## <span id="appendix-4">[Appendix 4: Instruction Following](#appendix-nav)</span>
Instruction following refers to whether the model, once it understands your instructions, can carry them out accurately and completely. This covers not only answering questions, but also completing tasks in the specified format, style, and steps.
For example, the following are all instructions with clear requirements for the model:
- Summarize this article into three key points
- Write a reply email in a formal, polite tone
- Translate these words into English and create an example sentence for each
- Extract the author, time, and main events from the article
A model with strong instruction following ability typically has these characteristics:
- Outputs content in the required quantity
For example, if asked to summarize three key points, it won't give five.
- Covers all specified elements
For example, if asked to extract author, time, and events, it won't omit any of them.
- Follows the specified format and tone
For example, if asked to use a formal tone, it won't output overly colloquial responses.
- Doesn't make unnecessary additional extensions
For example, if only asked to translate and create sentences, it won't output a large paragraph of unrelated explanations.
In practical applications, strong instruction following ability is very important for these reasons:
- Improved stability: The same instruction produces more consistent output structure and behavior patterns across different times and multiple runs, less likely to go off-script
- Improved reproducibility: When you configure a prompt into a product or workflow, you can predict roughly how the model will respond, making testing and iteration easier
- Easier system integration: When model output conforms to expected formats, it's easier to automatically interface with backend programs, workflows, or other tools
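To make the "system integration" point concrete, here is a small Python sketch of how a backend program might check a model's reply before using it. It assumes the prompt asked for a JSON object with exactly the fields `author`, `time`, and `events` (the extraction example above); the function name and the sample replies are invented for illustration.

```python
import json

# The elements the instruction asked the model to extract.
REQUIRED_FIELDS = ("author", "time", "events")


def check_reply(raw_reply: str) -> list[str]:
    """Return a list of problems; an empty list means the format was followed."""
    try:
        data = json.loads(raw_reply)
    except json.JSONDecodeError:
        return ["reply is not valid JSON"]
    if not isinstance(data, dict):
        return ["reply is not a JSON object"]
    # Every requested element must be present...
    problems = [f"missing field: {name}"
                for name in REQUIRED_FIELDS if name not in data]
    # ...and nothing unrequested should be added.
    extras = sorted(set(data) - set(REQUIRED_FIELDS))
    problems += [f"unrequested field: {name}" for name in extras]
    return problems


# Made-up replies: one that follows the instruction, one that does not.
good = '{"author": "A. Writer", "time": "2024", "events": ["launch"]}'
bad = "Sure! Here is the information you asked for: the author is A. Writer."

print(check_reply(good))
print(check_reply(bad))
```

A model with strong instruction following passes this kind of check almost every time, so the program can consume the reply directly instead of needing retries or manual cleanup.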
Therefore, when selecting and evaluating a large language model, in addition to focusing on whether it's smart and has broad knowledge coverage, you also need to pay special attention to its instruction following ability. For industrial-grade applications, being able to stably and accurately execute instructions is often more important than occasionally giving a stunning answer.
@@ -0,0 +1,129 @@
# Build Your First AI Product
Welcome to the **Build Your First AI Product** stage! This is the starting point of the Easy-Vibe tutorial, specifically designed for learners with zero programming background.
## What You Will Learn
In this stage, you will start from zero and master the Vibe Coding workflow, becoming a one-person team capable of designing products independently.
### Getting Started
Suitable for learners with product, operations, and non-technical backgrounds. Understand AI programming logic through games and build learning confidence:
<NavGrid>
<NavCard
href="/en/stage-0/0.1-learning-map/"
title="1. Learning Map"
description="Understand the entire learning path and clarify the goals and outcomes of each stage."
/>
<NavCard
href="/en/stage-0/0.2-ai-capabilities-through-games/"
title="2. AI Era: If You Can Speak, You Can Code"
description="Experience the charm of AI programming through mini-games like Snake and overcome the fear of coding."
/>
</NavGrid>
### Product Prototype Practice
Master the Vibe Coding workflow, learn to deconstruct requirements, and independently complete high-fidelity Web application prototypes:
<NavGrid>
<NavCard
href="/zh-cn/stage-1/1.1-introduction-to-ai-ide/"
title="1. Master AI Programming Tools"
description="Learn about current mainstream AI programming tools and choose the most suitable development partner for you."
/>
<NavCard
href="/zh-cn/stage-1/1.0-finding-great-idea/"
title="2. Find Great Ideas"
description="Learn to find and validate product ideas and pick projects worth doing."
/>
<NavCard
href="/zh-cn/stage-1/1.2-building-prototype/"
title="3. Build Product Prototypes"
description="Learn how to quickly transform product ideas into visual prototypes for low-cost trial and error."
/>
<NavCard
href="/zh-cn/stage-1/1.3-integrating-ai-capabilities/"
title="4. Integrate AI Capabilities"
description="Give your prototype intelligent interaction capabilities by integrating simple AI APIs."
/>
<NavCard
href="/zh-cn/stage-1/1.4-complete-project-practice/"
title="5. Complete Project Practice"
description="Apply everything you have learned to build a complete product prototype from 0 to 1."
/>
</NavGrid>
### Appendix: Business Thinking
**Why needed**: When you want to strengthen your product thinking and understand industry application scenarios, this content helps you build a more comprehensive product perspective.
**When to read**:
- Before starting to build a prototype, understand product thinking first to help you plan and design better.
- When you have a product idea but are unsure of the direction, refer to industry scenario cases.
- After completing the project, use product thinking to review and optimize your work.
<NavGrid>
<NavCard
href="/zh-cn/stage-1/appendix-a-product-thinking/"
title="Product Thinking and Solution Design"
description="Fill in the thinking models product managers rely on to improve requirements analysis and product design skills."
/>
<NavCard
href="/zh-cn/stage-1/appendix-industry-scenarios/"
title="AI Industry Application Scenarios (B-end)"
description="Understand AI application scenarios in different industries to find product inspiration and direction."
/>
<NavCard
href="/zh-cn/stage-1/appendix-c-consumer-scenarios/"
title="AI Consumer Scenarios Inspiration (C-end)"
description="Explore AI application scenarios in consumer-grade products and inspire creative ideas."
/>
</NavGrid>
### Appendix: Technical Solutions
**Why needed**: When you run into technical problems during development or want to find better tools, these technical appendices provide ready-to-use solutions.
**When to read**:
- When you encounter an error and don't know how to solve it, check common errors and solutions.
- When you want to compare different AI programming tools, refer to the hands-on platform comparison.
- When you want to learn more advanced development skills, check the agent development cases.
<NavGrid>
<NavCard
href="/zh-cn/stage-1/appendix-b-common-errors/"
title="What to Do When You Hit Errors While Coding"
description="Summarizes common error messages and their solutions to help you troubleshoot problems quickly."
/>
<NavCard
href="/zh-cn/stage-1/appendix-articles/example0-1/vibe-coding-tools-snake-game-tutorial"
title="Comparison of Seven AI Programming Tools"
description="Compare and test mainstream AI programming platforms to help you choose the most suitable tool."
/>
<NavCard
href="/zh-cn/stage-1/appendix-articles/example0-2/vibe-coding-tools-build-website-with-ai-coding-and-design-agents"
title="Design Websites Using Design and Programming Agents"
description="Learn how to make AI agents work together to improve development efficiency."
/>
</NavGrid>
## Suitable for
- Product managers and operations staff with no programming background
- Entrepreneurs who want to quickly validate ideas
- Non-technical individuals interested in AI programming
- Designers who hope to improve their prototype design capabilities
## How to Learn?
We suggest following this order:
```
Play games to build confidence → Follow the tutorial to make a prototype → Add AI functions to the prototype → Independently complete a full project
```
Don't read everything at once; learning while doing is most effective. When you encounter problems, remember to check the appendix for solutions.
Start now: select the first section in the left navigation!
@@ -0,0 +1,361 @@
# Beginner Level 2: Learn AI Programming Tools
## Chapter Overview
<script setup>
const duration = 'About <strong>1 day</strong>, can be completed in multiple sessions'
</script>
<ChapterIntroduction :duration="duration" :tags="['Local Development Environment Setup', 'IDE vs AI IDE', 'Efficient Development Tips']" coreOutput="1 original game you create" expectedOutput="Built using Trae">
Previously, we experienced AI programming on z.ai, but the web version has many limitations — you **can't save your work anytime**, it's **hard to manage files**, and you **can't handle complex projects**. This chapter helps you move your development environment to your own computer so you can **truly build things independently**.
We'll first clarify **what the difference is between an IDE and an AI IDE**, and why the latter can **double your efficiency**. Then we'll **walk you through step by step** using Trae to build a Snake game locally, covering the **complete workflow** from installation to running. Finally, we'll share some **practical tips** for communicating with AI so you can avoid common pitfalls.
After completing this chapter, you'll have **mastered a development workflow similar to that of professional programmers**.
::: tip 💡 Advanced Tip
If you have some programming experience and want to use more powerful tools early on, you can refer to [Modern CLI Coding Tools](../../stage-2/backend/2.6-modern-cli/extra7/) to develop using the command line.
:::
</ChapterIntroduction>
<div style="margin: 50px 0;">
<ClientOnly>
<StepBar :active="0" :items="[
{ title: 'Understanding the Environment', description: 'IDE vs AI IDE' },
{ title: 'Hands-on Practice', description: 'Build Snake with Trae' },
{ title: 'Tool Deep Dive', description: 'Explore the IDE Interface' },
{ title: 'Communication Skills', description: 'Talk to AI Effectively' }
]" />
</ClientOnly>
</div>
## 1. What Environment and Tools Do You Need to Write Code
### 1.1 Mindset Shift: When in Doubt, Ask AI First
Before we introduce the various environments and tools, here's an important reminder: you need to **change your thinking habits**.
When learning programming the traditional way, if you needed to install Python, configure Conda, or fix an npm installation failure, you'd typically open a search engine, find a tutorial, and follow the steps one by one. If you hit an error along the way, you'd search for the error message and try again repeatedly.
Wrong! ❌
In the AI era, especially when using an AI IDE, remember one core principle: **For any task, you can ask AI first, or even let it do it for you.**
- **Don't know how to set up your environment?** Just ask AI in the sidebar: "I want to write Python. Please check if Python is installed, and if not, install it for me."
- **Network stuck?** If installing dependencies keeps spinning or throwing errors, just throw the error to AI: "The download failed. Is it a network issue? Can you help me switch to a different mirror source?"
- **Can't remember commands?** No need to memorize Git or Conda commands. Just tell AI: "Help me create a new virtual environment called demo."
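For the curious, here is roughly what such requests boil down to behind the scenes. This is a minimal sketch using only Python's standard library: it looks for a Python interpreter and creates a virtual environment. It uses the built-in `venv` module rather than Conda, and the function names are invented for illustration; what an AI assistant actually runs on your machine may well differ.

```python
import shutil
import venv
from pathlib import Path
from typing import Optional


def find_python() -> Optional[str]:
    """Return the path of a Python interpreter on PATH, or None if none is found."""
    return shutil.which("python3") or shutil.which("python")


def create_environment(name: str, with_pip: bool = True) -> Path:
    """Create a virtual environment in the folder `name` and return its path."""
    target = Path(name)
    venv.create(target, with_pip=with_pip)
    return target


# Step 1 of the "check if Python is installed" request.
print("Python found at:", find_python())
```

Calling `create_environment("demo")` would then create a folder named `demo` containing an isolated Python environment. You don't need to memorize any of this; the point is that the assistant is running ordinary, inspectable steps on your behalf.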
### 1.2 Why You Need an Environment and Tools
Going from "trying to write a few lines of code" to "building a long-term maintainable project" requires completely different environments and tools.
In theory, you could write code with the system's built-in Notepad, but problems quickly arise:
- **All code is plain black text** — keywords, strings, and comments are all mixed together, making it hard to see the structure at a glance
- **No smart suggestions** — you have to type every word completely by hand, and a single typo means repeatedly checking your code
- **Files become chaotic** — switching back and forth between dozens of files, often unable to find the line you need to edit
- **Debugging is guesswork** — when the program crashes, you don't know what went wrong and can only add print statements line by line
That's why you need an IDE (Integrated Development Environment). It displays code in different colors, provides auto-suggestions as you type, organizes files by project, and lets you trace errors step by step — making development more efficient and less error-prone.
## 2. What Is an IDE, and Why Do You Need One
::: info Pre-reading Tip
If you're not yet familiar with what an IDE is or what each interface element does, we recommend reading [IDE Basics](/zh-cn/appendix/2-development-tools/ide-basics) first to learn the basic concepts and common features.
:::
In the early days of programming, all we needed was a simple text editor and a language processor. But as projects grew more complex, developers urgently needed a tool that could efficiently manage files, support syntax highlighting, and enable debugging — and thus the Integrated Development Environment (IDE) was born.
You can think of an IDE as a program specifically designed to "edit, manage, run, and debug" code. Early IDEs looked very "primitive" and were operated almost entirely through the keyboard.
![](../../../zh-cn/stage-1/1.1-introduction-to-ai-ide/images/image1.png)![](../../../zh-cn/stage-1/1.1-introduction-to-ai-ide/images/image2.png)
Terminal Interface — Image source: https://en.wikipedia.org/wiki/File:Emacs-screenshot.png
Well-known, mature terminal-based editors like `Vim` are still commonly used for work on remote servers.
![](../../../zh-cn/stage-1/1.1-introduction-to-ai-ide/images/image3.png)
For greater efficiency, we need modern IDEs that support mouse interaction, typically including:
- **Source Code Editor**: Syntax highlighting, auto-completion.
- **Build and Run Tools**: Built-in compiler/interpreter.
- **Debugger**: Breakpoint debugging, variable inspection.
Modern IDEs often also include built-in tools like Git. The most popular is Microsoft's **[Visual Studio Code (VS Code)](https://code.visualstudio.com/)**, which is lightweight and extensible. While there are also professional IDEs like the JetBrains suite, VS Code is the most beginner-friendly.
![](../../../zh-cn/stage-1/1.1-introduction-to-ai-ide/images/image4.png)
VS Code's core philosophy is "everything is a plugin." Through its plugin system, it supports various languages — install the Python plugin and it becomes a Python IDE, install the C++ plugin and it becomes a C++ IDE. Without plugins, it's just an advanced text editor.
![](../../../zh-cn/stage-1/1.1-introduction-to-ai-ide/images/image5.png)
You can even use it to edit Markdown documents.
![](../../../zh-cn/stage-1/1.1-introduction-to-ai-ide/images/image6.png)
In short, an IDE is a set of tools that helps developers write code and run programs efficiently.
For more detailed explanations, check out the [Virtual IDE Visualization section in the Appendix](/zh-cn/appendix/2-development-tools/ide-basics).
## 3. How Is an AI IDE Different from a Regular IDE
A regular IDE (like the original VS Code) is essentially a "toolbox":
You can open projects, write code, run and debug, and install plugins — but the prerequisite is that you need to know what to do and how to do it yourself:
- When there's an error, you read the message yourself and figure out which line has the problem;
- When you want to add a new page or API endpoint, you find the right file and write the code yourself;
- When you want to configure the environment or build the project, you look up the documentation and follow the steps yourself.
But in an AI IDE, you can directly use a large language model to help you code and modify files:
- Just say "make a login page," and it generates the basic code structure first;
- Throw the error message and related code at it, and let it analyze the cause and suggest fixes;
- After you confirm, let it automatically create files, batch-edit code, and handle cross-file grunt work.
For example, you can select a piece of code and ask it to "refactor this" or "add comments." You can also ask in the sidebar "How is this project designed?" and specify the reference scope using `@filename` or `@entire project`, completing the tedious operations of creating files, writing code, and running with a single sentence.
Recent versions of VS Code already have a large language model assistant built in. You can have conversations with the model about the entire codebase, a specific file, or even a specific function. You can also use it like the auto-coding tools you used on the web: send your requirements as prompts to the built-in coding Agent, and let it automatically implement the features you need, create files, modify code, configure environments, and more.
You can download and install VS Code, click the sidebar entry in the top-right corner, and open the AI feature area to experience these capabilities.
![](../../../zh-cn/stage-1/1.1-introduction-to-ai-ide/images/image7.png)
However, VS Code is not the IDE with the strongest AI capabilities. For scenarios that require heavy AI-assisted coding, we often want to use "smarter, more efficient" tools — a good AI IDE can significantly save time on writing code and fixing bugs. Below we'll introduce several popular AI IDEs. You can choose any AI IDE based on your personal preference.
Since VS Code is open source (anyone can download the source code and compile it themselves), the vast majority of AI IDEs on the market today are built on top of VS Code. So you don't need to worry about "learning many different IDEs" — **as long as you're familiar with the basics of VS Code**, migrating to these AI IDEs doesn't require starting from scratch.
Generally speaking, the differences between AI IDEs mainly come down to four aspects: pricing; available model types (some advanced models may be restricted in certain regions); Agent capabilities (how smart and capable it is at assisting with coding); and speed and performance. You can choose based on your own testing results — the best tool is the one that works best for you.
> Typical AI IDEs generally have the following core capabilities:
>
> - Smart Code Generation and Completion: In traditional IDEs, we typically type a few characters to auto-complete variable or function names. In modern AI IDEs, you can write a few lines of pseudocode or simply describe your requirements, and the IDE will auto-complete the full logic, or even generate large blocks of code based on instructions.
> - Code Understanding and Q&A: The IDE can understand and answer questions about a specific piece of code, a file, or even the entire project directory structure.
> - Code Refactoring and Optimization: The IDE can rewrite or optimize the implementation logic of specified code snippets based on your intent.
> - Automatic Test Generation: The IDE can automatically generate test code for different functions and modules, making it easy to perform targeted testing.
> - Agent-style Task Execution: Smart Agents can automatically generate, build, install, run, and modify code, partially replacing the work of junior software engineers in many tasks.
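
To give a feel for "Automatic Test Generation," here is a sketch of the kind of result you might get when asking an AI IDE to write tests for a small function. The example is in Python for brevity (the hands-on project later in this chapter uses React), and both the `next_head` function and the tests are invented for illustration rather than taken from any particular tool's output.

```python
import unittest

# Movement offsets on a grid where y grows downward (an assumption of this sketch).
MOVES = {"up": (0, -1), "down": (0, 1), "left": (-1, 0), "right": (1, 0)}


def next_head(head: tuple[int, int], direction: str) -> tuple[int, int]:
    """Return the snake head's next position after moving one cell."""
    dx, dy = MOVES[direction]
    return (head[0] + dx, head[1] + dy)


class NextHeadTests(unittest.TestCase):
    """The kind of tests an assistant might generate for next_head."""

    def test_moves_right(self):
        self.assertEqual(next_head((2, 3), "right"), (3, 3))

    def test_moves_up(self):
        self.assertEqual(next_head((2, 3), "up"), (2, 2))

    def test_unknown_direction_raises(self):
        with self.assertRaises(KeyError):
            next_head((2, 3), "sideways")


# Run the tests and keep the result so it can be inspected.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(NextHeadTests)
result = unittest.TextTestRunner().run(suite)
```

Generated tests still deserve a quick read: check that each one describes behavior you actually want before trusting a green result.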
::: details Antigravity
### [Antigravity](https://antigravity.google/)
Antigravity is a brand-new AI IDE released by Google in November 2025 alongside Gemini 3, adopting an "Agent-First" development model. Unlike traditional AI-assisted coding, Antigravity makes the AI agent the "active executor," capable of directly operating the editor, terminal, browser, and other tools, taking on more "execution," "planning," and "verification" work. Developers only need to express high-level intent, and the agent will automatically break down tasks, create plans, execute code, run tests, and generate results. It supports multi-model switching, including Gemini 3 Pro, Claude Sonnet 4.5, and more. It's currently available as a public preview, supporting Windows, macOS, and Linux.
:::
::: details Trae
### [Trae](https://www.trae.ai/)
![](../../../zh-cn/stage-1/1.1-introduction-to-ai-ide/images/image8.png)
Trae is an AI programming assistant developed by ByteDance that supports over 100 programming languages and can be integrated into mainstream IDEs. Its features include: generating code from natural language, automatic debugging, and converting design mockups into React/Vue components. After its August 2025 update, Trae added smart dependency imports, rename suggestions, task checklist management, and more. SOLO mode also began supporting backend code generation and technical architecture document editing.
:::
::: details Cursor
### [Cursor](https://cursor.com/)
Cursor is an AI code editor developed by Anysphere, built on a customized VS Code, with optimizations focused on large-scale codebases and multi-file collaboration scenarios. It supports models like GPT-4o and Claude 3.7. The Claude Max mode introduced in 2025 can handle projects with millions of lines of code. The Pro version removed request limits, making it ideal for complex enterprise projects.
Currently, Cursor is arguably one of the best AI IDEs with a graphical interface in terms of overall experience, with a large user base and frequent feature updates. Its biggest drawback is the higher price — the Pro version costs about $20 per month.
![](../../../zh-cn/stage-1/1.1-introduction-to-ai-ide/images/image9.png)
:::
::: details Qoder
### [Qoder](https://qoder.com/)
Qoder is an AI IDE from Alibaba that emphasizes "transparent collaboration" and "enhanced context engineering capabilities." It supports breaking tasks into multiple steps through Action Flow and tracks AI execution in real time. It also supports multi-model dynamic routing and task state machine management, making it ideal for architecture governance in medium-to-large projects and "reverse engineering" analysis of legacy systems.
![](../../../zh-cn/stage-1/1.1-introduction-to-ai-ide/images/image10.png)
:::
::: details CodeBuddy
### [CodeBuddy](https://www.codebuddy.com/)
CodeBuddy is an AI programming tool from Tencent Cloud that emphasizes Chinese language command support and enterprise-grade compliance capabilities. It offers code completion, batch code review, and multi-model switching. Its Craft agent can perform multi-file code generation and API integration. The enterprise version supports private deployment and has passed Level 3 security certification, making it suitable for industries with high data security requirements such as finance and healthcare.
![](../../../zh-cn/stage-1/1.1-introduction-to-ai-ide/images/image11.png)
:::
::: details VS Code + Cline
### VS Code + [Cline](https://cline.bot/)
Cline is an AI programming Agent plugin for VS Code (Visual Studio Code) that can flexibly switch between different large models by configuring different API endpoints. Cline supports multimodal input, MCP tool extensions, and cost monitoring, with all operations requiring user confirmation before execution. It's ideal for quickly validating ideas or integrating with existing development workflows. Basic features are free, and the enterprise version supports deploying models in private environments.
![](../../../zh-cn/stage-1/1.1-introduction-to-ai-ide/images/image13.png)
![](../../../zh-cn/stage-1/1.1-introduction-to-ai-ide/images/image14.png)
:::
::: details Kiro
### [Kiro](https://kiro.dev/)
Kiro is an AI programming IDE from AWS (Amazon Web Services), deeply integrated with Amazon Bedrock and the AWS cloud service ecosystem. It supports multiple large models including Claude and Nova, making it particularly suitable for development scenarios that require tight integration with AWS cloud services. Kiro provides smart code generation, automated testing, and seamless integration with AWS resources (such as Lambda, S3, DynamoDB), offering unique advantages for cloud-native application development.
> **Note**: If you mainly want to use Anthropic Claude models, Cursor, Kiro, or Antigravity are convenient choices, since each of them lists Claude among its supported models. Other tools described in this chapter can also reach Claude in some configurations, for example the international version of Trae, or Cline with your own API endpoint.
:::
<div style="margin: 50px 0;">
<ClientOnly>
<StepBar :active="1" :items="[
{ title: 'Understanding the Environment', description: 'IDE vs AI IDE' },
{ title: 'Hands-on Practice', description: 'Build Snake with Trae' },
{ title: 'Tool Deep Dive', description: 'Explore the IDE Interface' },
{ title: 'Communication Skills', description: 'Talk to AI Effectively' }
]" />
</ClientOnly>
</div>
## 4. Hands-on: Build a Snake Game Locally with an AI IDE
The previous sections were mainly about "concepts" and "differences." In this section, we'll turn abstract concepts into concrete actions through a complete hands-on exercise: **Create a new empty folder -> Open it with an AI IDE -> Chat in the sidebar and have it build a Snake game from scratch using React.** Here we'll use Trae as our example, so first we need to install it and understand what Trae is.
::: tip 💡 Quick Tip: Seamless Transition from Web to Local
If you've previously developed projects on z.ai or other web-based AI programming platforms, you can download the code directly to your local machine and open it with an AI IDE to continue development. This way you can keep your previous work while enjoying the more powerful AI assistance of a local IDE.
The steps are simple:
1. Click the download button on platforms like z.ai to save the project locally
2. Unzip and open the folder with an AI IDE like Trae/Cursor
3. Continue chatting with AI in the sidebar to iterate and improve your project
:::
### 4.1 Preparation: Install and Learn About Trae
#### 4.1.1 What Is Trae
Trae's full name can be understood as "The Real AI Engineer." It's an adaptive AI Integrated Development Environment (IDE) developed by ByteDance. It's built on top of the popular VS Code, which means if you're already familiar with VS Code, you'll find Trae's interface layout and basic operations very familiar and comfortable.
Trae's core goal is to be a developer's "smart programming partner." Through deep AI integration, it can automatically handle a large amount of repetitive work, providing you with a more intuitive and efficient development experience. It's not just a "code completion tool" — it aims to assist throughout the entire development workflow, from creating projects, writing code, debugging, testing, to deployment.
#### 4.1.2 Installing Trae
Trae comes in an international version and a China version. The international version requires access to overseas networks but lets you use the latest overseas models like GPT-5. The China version primarily supports the latest domestic large models such as GLM, Qwen, Kimi, etc.
International version download: https://www.trae.ai/
China version download: https://www.trae.cn/
##### Trae Pricing and Usage Options
::: info 💡 Version Selection Tips
- If you're primarily using it in China, we recommend choosing the China version for more stable network access and support for domestic large models
- If you need to use overseas models like GPT-5 and your network conditions allow it, you can choose the international version
- If you already have a third-party model API Key, connecting third-party models gives you flexible cost control
:::
> 💡 **Currently recommended: Use OpenRouter free models for testing**
>
> As of the time this tutorial was written (2026-02-12), you can still try StepFun's models for free. See section 4.2 below for how to connect the model `stepfun/step-3.5-flash:free`.
Regarding Trae's costs and usage options, here are several choices:
- **China Version (Recommended)**: Basic usage is free, but due to high user volume, you may need to wait in a queue.
- **International Version**: Subscription costs about $3 per month, giving access to overseas models like GPT-5, but requires overseas network access.
- **Third-party Model Integration**: If you already have an API key from a domestic large model provider (such as DeepSeek, Tongyi Qianwen, Kimi, etc.), you can connect these APIs through Trae's third-party model configuration. Major cloud service providers (such as Alibaba Cloud, Tencent Cloud, Baidu Cloud, etc.) typically offer Coding Plan subscriptions that let you use their large model APIs at more favorable prices. This way you can freely choose your preferred model while controlling costs.
We recommend beginners start with the free China version. If you encounter queuing issues or need more stable service, consider connecting a third-party model and purchasing the corresponding cloud provider's Coding Plan.
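If you're wondering what "connecting a third-party model" means under the hood, the sketch below builds (but does not send) the kind of HTTP request an IDE would make for you. It assumes OpenRouter's OpenAI-compatible chat completions endpoint and reuses the model ID mentioned above; `YOUR_API_KEY` is a placeholder, and whether that free model is still offered when you read this is not guaranteed.

```python
import json
import urllib.request

# Assumed endpoint: OpenRouter's OpenAI-compatible chat completions API.
OPENROUTER_URL = "https://openrouter.ai/api/v1/chat/completions"
MODEL = "stepfun/step-3.5-flash:free"  # model ID taken from the tutorial text


def build_request(api_key: str, prompt: str) -> urllib.request.Request:
    """Build a chat completion request without sending it."""
    payload = {
        "model": MODEL,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        OPENROUTER_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


request = build_request("YOUR_API_KEY", "Write a Snake game in React.")
print(request.get_method(), request.full_url)

# To actually send it (needs a real key and network access):
# with urllib.request.urlopen(request) as response:
#     reply = json.load(response)
#     print(reply["choices"][0]["message"]["content"])
```

In Trae you only fill in the provider, model ID, and API key in a settings form; the IDE takes care of building and sending requests like this one.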
#### 4.1.3 Trae Interface Overview
In terms of interface design, Trae is very similar to the VS Code we use daily: the same classic three-column layout with a file explorer on the left, an editing area in the center, and an extension panel on the right.
![](../../../zh-cn/stage-1/1.1-introduction-to-ai-ide/images/image17.png)
The sidebar on the right is the Copilot interaction window, which can also be thought of as the Agent window. If you can't see it right away, click the sidebar icon in the top-right corner of Trae to open it.
![](../../../zh-cn/stage-1/1.1-introduction-to-ai-ide/images/image18.png)
After opening the sidebar, you'll see a `Builder` option — this is the Agent mode. Simply put, it's like a "local version" of z.ai that can operate your local environment, install runtime environments, open web pages, and more.
![](../../../zh-cn/stage-1/1.1-introduction-to-ai-ide/images/image19.png)
After clicking "Builder," you'll see "Chat" mode and "Builder with MCP" mode:
- **Chat Mode**: Primarily used for chatting about the code in your current folder, or as a general chat model. (You can open a folder through the "File" menu in the top-left corner and edit within that folder. In this case, any files Builder creates or modifies will only happen inside this folder.)
- **Builder with MCP Mode**: Gives the Agent more tools to work with (such as connecting the language model to other software or querying the weather). Put simply, MCP makes it easier for the language model to call external tools.
![](../../../zh-cn/stage-1/1.1-introduction-to-ai-ide/images/image20.png)
In the area below, you'll also see model selection options — click to change the current large model. In the China version, you can choose domestic models like Kimi k2 or GLM. If you're using the international version of Trae, you can also select overseas models like ChatGPT or Claude. However, since domestic large models are developing very rapidly, Kimi, Qwen, GLM, and others already offer experiences close to Claude 3.5 or 3.7 in many tasks, which is more than sufficient for daily development. There's no strict requirement to use the international or China version here.
**Note that we don't recommend using Auto mode (automatic model selection). For the international version, we recommend using Gemini or GPT models. For the China version, we recommend trying domestic models like Kimi k2, Minimax, or GLM.** Different models suit different use cases — there's no dogmatic rule about which is better. When you hit a wall with one model, try switching to another. Through multiple tests, you'll find the best results for your own workflow.
![](../../../zh-cn/stage-1/1.1-introduction-to-ai-ide/images/image21.png)
That's a brief introduction to Trae. Next, let's revisit what we did previously on z.ai and try doing the same thing in Trae.
### 4.2 Step 1: Create an Empty Folder and Open It with an AI IDE
Before getting started, we first need to prepare a clean project working directory.
For this section's example, you can create a new empty folder named `snake-game-react` on your local machine.
Then, open your installed AI IDE, select "Open Folder" on the startup screen, and import the empty folder as the project root directory. You can also drag the folder directly into the IDE window to open it. At this point, the file explorer on the left won't show any code files, indicating that we're starting from a completely blank project state.
::: details 📚 Optional: Connect a Cloud Service Provider's API or Coding Plan
This section introduces how to connect a cloud service provider's API or Coding Plan for more stable and frequent model calls. Screenshots of the Trae integration are provided at the end.
**What Is a Coding Plan**
A Coding Plan is a subscription offered by major cloud service providers. After purchasing, you can **use the provider's large model API without limits or at high frequency** for a certain period. Compared to per-token billing, a Coding Plan is more like a "monthly package" — you pay a fixed fee and can use it freely without worrying about per-call charges.
**Why Purchase a Coding Plan**
You might ask: since you can call large models directly via API, why buy a Coding Plan? The main reason is **usage without per-call billing**. Within the plan's limits, you can call the large model as often as you need without worrying about costs exploding or constantly checking billing statements.
**Recommended Domestic Cloud Service Coding Plans**
Here are recommended Coding Plan options from major domestic cloud service providers:
- Zhipu AI (BigModel Plan): https://bigmodel.cn/glm-coding
- Volcengine (ByteDance Cloud AI Plan): https://www.volcengine.com/activity/codingplan
> 💡 **You can also directly connect a large model API**
> Besides Coding Plans, you can also directly connect various model APIs through Add Model. You can refer to the method below for connecting the OpenRouter StepFun free API to integrate it with Trae. Testing shows it meets basic programming needs.
> If you need to top up, we suggest starting with a small amount (e.g., 10 RMB) to see how long it lasts, such as with cost-effective models like DeepSeek.
**How to Connect a Coding Plan**
Connecting a Coding Plan is very simple and takes just a few minutes:
1. Visit your chosen cloud service provider's website (e.g., Zhipu AI: https://bigmodel.cn/glm-coding, Volcengine: https://www.volcengine.com/activity/codingplan)
2. Register an account and log in
3. Find the "Pricing" or "Coding Plan" page
4. Choose a plan that suits you and complete the payment
5. After payment, you'll receive an API Key or Plan ID
::: tip 🎯 Custom Model Recommendations
When connecting custom models in Trae, we **recommend using the OpenRouter approach by default**. OpenRouter provides a unified API interface for conveniently connecting to multiple large language models.
**As of February 12, 2026, you can still use StepFun's free API:**
- **`stepfun/step-3.5-flash:free`**: A free model from StepFun that can be directly connected in Trae.
**Other free models:**
- **`openrouter/free`**: A model option that uses free LLM APIs by default. You can use it directly in Trae's Custom Model integration (just enter the model ID), experiencing AI programming features without any cost.
These free options are great for beginners: you can get familiar with the AI IDE workflow before committing to a paid model for production use.
**Optional: Connect a Large Model API (Using DeepSeek as an Example)**
1. Visit the DeepSeek platform: https://platform.deepseek.com/usage
2. Register an account and log in
3. Purchase a 10 RMB token package on the top-up page
4. After topping up, create and copy an API Key on the API Keys page
5. In Trae, click **"Add Model"**, find DeepSeek, select the corresponding model, and enter the API Key to start using it
Use the interface below to add a model. After opening the model option, **make sure to scroll all the way to the bottom**, where you'll find a "Custom Model" option. Click it and enter a model ID, such as the recommended `stepfun/step-3.5-flash:free`. Then click "Get Key" below to visit the provider's website and obtain the corresponding API Key.
![](../../../zh-cn/stage-1/1.1-introduction-to-ai-ide/images/index-2026-02-12-14-14-51.png)
![](../../../zh-cn/stage-1/1.1-introduction-to-ai-ide/images/index-2026-02-12-14-15-29.png)
:::
@@ -0,0 +1,287 @@
---
title: 'Adding AI Capabilities to Your Prototype - Integrating Text and Image APIs'
description: 'Integrate real AI capabilities into your existing web prototype: understand the core concepts of APIs, learn how to find API Keys and official examples; hands-on integration of DeepSeek text model and various image generation services (SiliconFlow Qwen-Image, Recraft, Seedream), and master common model selection methods.'
---
<script setup>
const duration = 'About <strong>1 day</strong>'
</script>
# Beginner Level 4: Injecting AI Capabilities into Your Prototype
## Chapter Introduction
<ChapterIntroduction :duration="duration" :tags="['API', 'Text Model', 'Text-to-Image', 'Prototype Integration']" coreOutput="Prototype integrated with 1 text model + 1 image model (optional)" expectedOutput="AI prototype capable of calling real APIs">
In the previous chapters, we completed the entire process from **finding a great idea** to **building a product prototype**. But the current prototype is still just a "shell" — clicking buttons won't actually generate content, and all the data on the page is hardcoded.
Remember what we emphasized in the first chapter? **We want to build "products people are willing to pay for," not "prototypes that just look good."** Real value comes from a product that can **solve real problems**, and to achieve that, the prototype must be able to **actually run**.
This chapter will bring your prototype **"to life"**: we'll integrate **real AI capabilities**, starting from obtaining an API Key, reading official documentation, and having the AI IDE help you integrate the interface into your code. Using **DeepSeek's text model** as an example, you'll learn how to make your application **actually call a large language model to generate content**; if you're interested, you can also **optionally integrate image generation**.
After completing this chapter, your prototype will **no longer be a static demo**, but rather **an application that can call real AI capabilities and solve real problems**.
</ChapterIntroduction>
<div style="margin: 50px 0;">
<ClientOnly>
<StepBar :active="0" :items="[
{ title: 'API Basics', description: 'Understand core concepts and security practices' },
{ title: 'Text Integration', description: 'DeepSeek text generation hands-on' },
{ title: 'Image Integration', description: 'VLM image understanding and generation' }
]" />
</ClientOnly>
</div>
# 1. API Fundamentals
As mentioned earlier, our goal is to "integrate AI capabilities" so that the prototype is no longer a static demo but a tool that can call real AI services. The key to achieving this lies in understanding and using APIs (Application Programming Interfaces).
An API is an important abstraction in computer science. Simply put: **you send a request in the format the other party requires, and they send back a result in an agreed format**.
- **What you send out**: Usually includes a "key (API Key)" and "what you want to generate"
- **What they send back**: If successful, you get the result; if it fails, they tell you why (e.g., "invalid key," "insufficient balance," "incorrect parameters")
Specifically, you need to master the following core elements:
1. **API Key**: Your "pass" and also your "wallet key." Anyone who gets it can make API calls on your behalf and incur charges.
2. **Endpoint**: The specific path for the API request, telling the server which function you want to access. The full request URL is typically composed of "Base URL + Endpoint path." For example:
- Text generation: Base URL (`https://api.service.com`) + Endpoint (`/v1/chat/completions`) = Full URL `https://api.service.com/v1/chat/completions`
- Image generation: Base URL (`https://api.service.com`) + Endpoint (`/v1/images/generations`) = Full URL `https://api.service.com/v1/images/generations`
3. **Call/Request**: The process of sending a task to the AI service and getting results back
4. **Request Content**: The specific content you send to the AI, such as the topic you want the AI to write about, the description of the image to generate, etc.
5. **Response**: The content the AI returns after processing, such as the generated article, image, etc.
6. **Error Handling**: Knowing how to troubleshoot when problems occur (such as incorrect API Key, too many requests, etc.)
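The elements above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not any provider's official client: `api.service.com` is the placeholder host from the list above, `build_url`, `build_request`, and `explain_error` are hypothetical helper names, and the status-code meanings follow common HTTP conventions that individual providers may not share exactly.

```python
import json
import urllib.request

BASE_URL = "https://api.service.com"  # placeholder host from the examples above


def build_url(base_url: str, endpoint: str) -> str:
    """Join a Base URL and an Endpoint path into the full request URL."""
    return base_url.rstrip("/") + "/" + endpoint.lstrip("/")


def build_request(api_key: str, endpoint: str, content: dict) -> urllib.request.Request:
    """Package the API Key and the request content into an HTTP request (not sent yet)."""
    return urllib.request.Request(
        build_url(BASE_URL, endpoint),
        data=json.dumps(content).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",  # the key travels in a header
        },
        method="POST",
    )


def explain_error(status_code: int) -> str:
    """Map common HTTP status codes to a troubleshooting hint (assumed conventions)."""
    hints = {
        400: "Incorrect parameters: compare your request with the official example.",
        401: "Invalid API Key: check that the key was copied completely.",
        402: "Insufficient balance: top up the account.",
        429: "Too many requests: wait a moment and retry.",
    }
    return hints.get(status_code, f"Unexpected status {status_code}: check the provider's docs.")
```

Nothing is sent until the request object is passed to `urllib.request.urlopen`, so you can inspect the URL and headers first when something goes wrong.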
::: info What is an API
For a more in-depth explanation of APIs, see the appendix: [Introduction to APIs](/zh-cn/appendix/4-server-and-backend/api-intro).
:::
::: warning 🔐 **API Security Notes**
The API Key is your "pass" for requesting AI services — it's a secret string used for authentication and billing.
Since the API Key is directly linked to your account and charges, be sure to:
- **Never share it** in group chats, screenshots uploaded online, or public forums
- **Never hardcode it** into your code and commit it to a Git repository (especially public repositories)
- If you suspect your Key has been leaked, **replace it with a new Key immediately**
In the steps below, we will **paste the API Key directly into the AI IDE**. **Don't do this in real projects!** Since we're just practicing, it's acceptable for now. (Once you're more experienced, you can have the AI generate a configuration file and keep the API Key there.)
:::
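Once you move beyond practice, one common way to keep the key out of your code is to read it from an environment variable. The sketch below assumes the variable is called `DEEPSEEK_API_KEY`; `load_api_key` and `mask_key` are hypothetical helper names used only for illustration.

```python
import os


def load_api_key(env_name: str = "DEEPSEEK_API_KEY") -> str:
    """Read the API Key from an environment variable instead of hardcoding it."""
    key = os.environ.get(env_name, "").strip()
    if not key:
        raise RuntimeError(f"Set the {env_name} environment variable before running.")
    return key


def mask_key(key: str) -> str:
    """Return a version of the key that is safe to show in logs or screenshots."""
    if len(key) <= 8:
        return "*" * len(key)
    # Keep the first 3 and last 4 characters, hide everything in between.
    return key[:3] + "*" * (len(key) - 7) + key[-4:]
```

With this approach the key never appears in a file that Git tracks, and `mask_key` lets you confirm which key is loaded without revealing it.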
<div style="margin: 50px 0;">
<ClientOnly>
<StepBar :active="1" :items="[
{ title: 'API Basics', description: 'Understand core concepts and security practices' },
{ title: 'Text Integration', description: 'DeepSeek text generation hands-on' },
{ title: 'Image Integration', description: 'VLM image understanding and generation' }
]" />
</ClientOnly>
</div>
# 2. Integrating the Text Generation API: DeepSeek
Although APIs involve these technical concepts, the actual operation during the prototyping phase can be very simple and efficient. The core approach is:
> **Find the official example, get the API Key, and have the AI IDE help you wire it to a button.**
Once you've grasped these concepts, you'll find that whether you're integrating a text model or an image model, the underlying process is the same: when the user clicks a button, the frontend organizes the input and sends a request; after the API returns a result, it displays the result on the page. Let's verify this through hands-on practice.
In `1.2 Building Your Prototype`, you already created an interactive prototype. What we need to do next is turn the "AI-like features" in the prototype into real, working capabilities: **when the user clicks a button, the prototype sends a request to an external AI service and displays the returned text.**
::: info Further Reading on Principles
If you want to learn more about the underlying principles, check out the appendix: [Introduction to Large Language Models (LLM)](/zh-cn/appendix/8-artificial-intelligence/llm-principles).
:::
::: details Learn More: What is DeepSeek?
**Hangzhou DeepSeek Artificial Intelligence Basic Technology Research Co., Ltd.**, operating under the brand name DeepSeek, is a **Chinese artificial intelligence (AI) company that develops large language models (LLMs)**. DeepSeek is headquartered in Hangzhou, Zhejiang, and is owned and funded by the Chinese hedge fund High-Flyer. DeepSeek was founded in July 2023 by Liang Wenfeng, co-founder of High-Flyer, who also serves as CEO of both companies. The company launched its eponymous chatbot and its DeepSeek-R1 model in January 2025.
Let's look at how DeepSeek compares with other top models in the GPQA benchmark rankings. Notably, DeepSeek is an open-source model (anyone can download the model from the internet), while other common models like Grok, Google Gemini, and ChatGPT are closed-source. As we can see, DeepSeek has largely caught up with the first tier of models.
![](../../../zh-cn/stage-1/1.3-integrating-ai-capabilities/images/index-2026-01-20-14-16-48.png)
GPQA stands for "Graduate-Level Google-Proof Q&A Benchmark," a benchmark of graduate-level scientific questions.
GPQA contains 448 multiple-choice questions covering subfields of biology, physics, and chemistry, such as quantum mechanics, organic chemistry, molecular biology, and more. These questions were written by 61 experts who hold or are pursuing doctoral degrees and have undergone a rigorous validation process.
:::
Follow these 3 steps to quickly integrate a large model generation API:
1. **Create an API Key on the DeepSeek platform**
2. **Find the text generation example in the DeepSeek documentation** (there's usually ready-made code you can copy directly)
3. **Open the AI IDE, paste in the API Key + official example**, and tell the AI what functionality to implement:
> Help me integrate this large model's API to support the copywriting generation task for this application
Next, we'll walk through a demo. You can follow along with the entire process. First, register a [DeepSeek](https://platform.deepseek.com/usage) account, create an API Key, and top up a small amount for testing.
![](../../../zh-cn/stage-1/1.3-integrating-ai-capabilities/images/index-2026-01-20-13-57-41.png)
![](../../../zh-cn/stage-1/1.3-integrating-ai-capabilities/images/index-2026-01-20-13-58-13.png)
Click "API KEYS" and find "create new API key" at the bottom of the screen. You'll end up with an API Key that looks something like `sk-8573341c39fc44315aadc071c53rh7d2`.
![](../../../zh-cn/stage-1/1.3-integrating-ai-capabilities/images/index-2026-01-20-13-58-32.png)
Once you have the key, you have permission to call the model.
At this point, you can directly read the [API](https://api-docs.deepseek.com/) documentation, which typically provides curl or Python call examples.
![](../../../zh-cn/stage-1/1.3-integrating-ai-capabilities/images/index-2026-01-20-13-58-56.png)
After finding the example, you can copy all the content from the documentation along with your key into the AI IDE's chat box, asking it to help you integrate the large language model into the prototype you've already developed.
![](../../../zh-cn/stage-1/1.3-integrating-ai-capabilities/images/index-2026-01-20-13-59-31.png)
Here's a reference prompt:
```
Based on this API call method, help me implement a copywriting generation feature that can generate Douyin (TikTok) e-commerce copy in various styles based on product information when clicked.
Reference materials:
api key: sk-8573341c39aefa1efe
api request reference:
curl https://api.deepseek.com/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer ${DEEPSEEK_API_KEY}" \
  -d '{
        "model": "deepseek-chat",
        "messages": [
          {"role": "system", "content": "You are a helpful assistant."},
          {"role": "user", "content": "Hello!"}
        ],
        "stream": false
      }'
```
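For reference, the curl request in the prompt above corresponds roughly to the following Python. Treat it as a hedged sketch of what the AI IDE might generate, not as DeepSeek's official client: the endpoint URL and the `deepseek-chat` model name come from DeepSeek's public API documentation, while `build_payload`, `extract_text`, and `generate_copy` are hypothetical names, and the system prompt wording is only an example.

```python
import json
import urllib.request

DEEPSEEK_URL = "https://api.deepseek.com/chat/completions"


def build_payload(product_info: str, style: str) -> dict:
    """Build the JSON body: a system instruction plus the user's product information."""
    return {
        "model": "deepseek-chat",
        "messages": [
            {"role": "system", "content": f"You write short e-commerce copy in a {style} style."},
            {"role": "user", "content": product_info},
        ],
        "stream": False,
    }


def extract_text(response_body: dict) -> str:
    """Pull the generated text out of an OpenAI-style chat completion response."""
    return response_body["choices"][0]["message"]["content"]


def generate_copy(product_info: str, style: str, api_key: str) -> str:
    """Send the request to DeepSeek and return the generated copy."""
    request = urllib.request.Request(
        DEEPSEEK_URL,
        data=json.dumps(build_payload(product_info, style)).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=60) as response:
        return extract_text(json.load(response))


# Example call (needs a valid key and network access):
# print(generate_copy("Stainless steel thermos, 500 ml", "playful", "sk-..."))
```

In a web prototype the same three steps happen in the button's click handler: build the payload from the form fields, send it, then display the extracted text.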
After the AI generates the code, you should have a copywriting generation button to test. If you can't find it, ask the AI IDE which page it is on. If that still doesn't help, describe what you want and ask the AI IDE to refactor the feature until you get the copywriting result you expect.
![](../../../zh-cn/stage-1/1.3-integrating-ai-capabilities/images/index-2026-01-20-14-23-23.png)
![](../../../zh-cn/stage-1/1.3-integrating-ai-capabilities/images/index-2026-01-20-14-26-35.png)
Of course, you might wonder: how do I know it's actually calling the large model and not just returning hardcoded responses? Enter your own product information or instructions on the spot and check that the generated copy reflects them.
If you find that the results are different each time and logically coherent, you can be confident that the API is being called correctly. You can also check the [API usage management platform](https://platform.deepseek.com/usage) to see if the calls were successful (though it may take a few minutes to show up).
# 3. Integrating the Image-to-Text API: Qwen3 VL
::: info Further Reading on Principles
If you want to learn more about the underlying principles, check out the appendix: [Introduction to Vision Language Models (VLM)](/zh-cn/appendix/8-artificial-intelligence/multimodal-models).
:::
::: details Learn More: What is Qwen3 VL?
**Qwen3 VL** is the latest version in the multimodal vision-language model series developed by Alibaba Cloud's Tongyi Qianwen team. VL stands for "Vision-Language," meaning it's a vision-language model. It can understand image content and generate text descriptions based on images, answer questions about images, extract information from images, and more.
![](../../../zh-cn/stage-1/1.3-integrating-ai-capabilities/images/index-2026-01-20-14-48-27.png)
![](../../../zh-cn/stage-1/1.3-integrating-ai-capabilities/images/index-2026-01-20-14-48-41.png)
**Key capabilities of Qwen3 VL include:**
- **Image Understanding**: Can recognize objects, scenes, people, text, and other content in images
- **Visual Q&A**: Accurately answers questions about images based on user queries
- **Image Captioning**: Generates detailed or concise text descriptions of images
- **Multi-image Understanding**: Supports processing multiple images simultaneously for comparative analysis
- **Text Extraction**: Extracts text content from images (OCR capability)
**Why choose Qwen3 VL?**
Compared to the previous generation, Qwen3 VL has significantly improved image understanding accuracy and supports longer, more complex image analysis tasks. It excels in Chinese language understanding, has relatively low API call costs, and offers good value for money. Additionally, its larger context window enables it to handle more complex visual reasoning tasks.
**Typical use cases:**
- E-commerce: Automatically generate titles, descriptions, and selling points from product images
- Content creation: Automatically generate copy or image suggestions based on reference images
- Office: Image content extraction, automatic report recognition
- Education: Automatic parsing of image-based questions, knowledge point extraction
:::
In the previous section, we explained how to integrate a text generation API. But for the application scenario above, we'll notice a problem: we're uploading an image, and if we only use a large language model, it can't understand the content of the image very well, so the generated results may be off.
We want a model that can help us turn an image into a text description — this requires a Vision Language Model (VLM). In our case, we'll use a vision language model to generate product selling point descriptions, improving the user experience.
For convenience, we'll use the API provided by [SiliconFlow cloud platform](https://cloud.siliconflow.cn/me) to integrate the image-to-text API.
::: details Learn More: What is SiliconFlow?
**SiliconFlow** is a well-known AI model aggregation platform in China, providing API services for various mainstream large language models and vision language models.
**Platform features:**
- **Multi-model support**: Integrates various mainstream AI models, including DeepSeek, Qwen, Llama series, and other open-source models
- **Technical optimization**: Optimized inference for open-source models, providing low-latency, high-concurrency API services
- **Interface compatibility**: Provides OpenAI-compatible API interfaces for easy integration with existing applications
- **Pay-as-you-go**: Supports usage-based billing
SiliconFlow is relatively mature in inference services for open-source large models and is a common choice for using domestic open-source AI models.
:::
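Because SiliconFlow exposes an OpenAI-compatible interface, switching providers mostly means changing the Base URL and the model name while the request body keeps the same shape. The sketch below illustrates that idea with plain data structures; `Provider`, `PROVIDERS`, and `chat_request` are hypothetical names, and the Base URLs are the ones used elsewhere in this chapter or in the providers' public documentation.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Provider:
    base_url: str  # where requests are sent
    model: str     # which model the provider should run


PROVIDERS = {
    "deepseek": Provider("https://api.deepseek.com", "deepseek-chat"),
    "siliconflow": Provider("https://api.siliconflow.cn/v1", "Qwen/Qwen3-VL-8B-Instruct"),
}


def chat_request(provider_name: str, user_text: str) -> Tuple[str, dict]:
    """Return the full URL and JSON body for a chat request to the chosen provider."""
    provider = PROVIDERS[provider_name]
    url = provider.base_url.rstrip("/") + "/chat/completions"
    body = {
        "model": provider.model,
        "messages": [{"role": "user", "content": user_text}],
    }
    return url, body
```

Only the URL and the `model` field differ between the two providers, which is why code written for one OpenAI-compatible service usually needs very few changes to work with another.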
Go to the SiliconFlow platform homepage, where you'll see many models to choose from. Find the filter in the upper left corner, click to expand it, select the "Vision" tag, and you'll see many image-to-text models, such as Zhipu GLM-4.6V or Qwen3-VL.
![](../../../zh-cn/stage-1/1.3-integrating-ai-capabilities/images/index-2026-01-20-15-05-04.png)
You can choose any one to test. Here we'll use `Qwen/Qwen3-VL-8B-Instruct` as an example.
![](../../../zh-cn/stage-1/1.3-integrating-ai-capabilities/images/index-2026-01-20-15-07-44.png)
Go to the [SiliconFlow platform](https://cloud.siliconflow.cn/me/account/ak), click "Create New API Key" in the API Keys section to create a new API Key.
You can directly use the code below as reference code, and send it along with the generated API Key to the AI IDE for feature integration.
::: details Image-to-Text Reference Code
```python
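from openai import OpenAI
from typing import Dict, Any, List
import base64
import os

# Assumption: the key is stored in an environment variable named SILICONFLOW_API_KEY.
# For a quick local test you can paste the key between the quotes instead.
SILICONFLOW_API_KEY: str = os.environ.get("SILICONFLOW_API_KEY", "")
SILICONFLOW_BASE_URL: str = "https://api.siliconflow.cn/v1/"
MODEL_NAME: str = "Qwen/Qwen3-VL-8B-Instruct"


def encode_image(image_path: str) -> str:
    """Read an image file and return its contents as a base64 string."""
    with open(image_path, "rb") as image_file:
        return base64.b64encode(image_file.read()).decode('utf-8')


def get_vlm_completion(client: OpenAI, messages: List[Dict[str, Any]]) -> str:
    """Send the messages to the vision language model and return its text reply."""
    response = client.chat.completions.create(
        model=MODEL_NAME,
        messages=messages,
        max_tokens=512,
        temperature=0.7,
        top_p=0.7,
        frequency_penalty=0.5,
        stream=False,
        n=1
    )
    return response.choices[0].message.content


def caption_image(image_path: str) -> str:
    """Ask the model to describe the image stored at image_path."""
    base64_image = encode_image(image_path)
    # One user message carrying both the instruction text and the image.
    messages = [
        {
            "role": "user",
            "content": [
                {
                    "type": "text",
                    "text": "Please describe this image in detail."
                },
                {
                    "type": "image_url",
                    "image_url": {
                        # The data URL assumes a JPEG file.
                        "url": f"data:image/jpeg;base64,{base64_image}"
                    }
                }
            ]
        }
    ]
    client = OpenAI(
        api_key=SILICONFLOW_API_KEY,
        base_url=SILICONFLOW_BASE_URL
    )
    return get_vlm_completion(client, messages)


if __name__ == "__main__":
    image_path = "images.jpg"  # replace with the path to your own image
    caption = caption_image(image_path)
    print(caption)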
```
:::
@@ -0,0 +1,126 @@
# Full-Stack Development
Welcome to the **Full-Stack Development** stage! Here, you will dive deep into full-stack development, mastering frontend componentization, database design, backend API development, and deployment.
## What You Will Learn
### Frontend Development
Master modern frontend development and learn to use component libraries and design tools:
<NavGrid>
<NavCard
href="#"
title="Frontend 0: Using Lovart for Assets"
description="Learn how to use AI tools like Lovart to quickly generate high-quality game assets and UI resources"
/>
<NavCard
href="#"
title="Frontend 1: Figma & MasterGo Basics"
description="Master the basic operations of professional UI design tools and the workflow from design to code"
/>
<NavCard
href="#"
title="Frontend 2: Building Your First Modern App - UI Design"
description="Design a modern web application interface from scratch, practicing UI design principles"
/>
<NavCard
href="#"
title="Frontend 3: UI Design Guidelines & Multi-Product UI"
description="Learn mainstream UI design guidelines to improve product design consistency and aesthetics"
/>
<NavCard
href="#"
title="Frontend 4: Let's Build Hogwarts Portraits"
description="Practical project: Build an interactive Hogwarts portrait application using AI-generated images"
/>
</NavGrid>
### Backend & Full-Stack
Learn API design, database management, and application deployment strategies:
<NavGrid>
<NavCard
href="#"
title="Backend 1: What Is an API"
description="Understand the core concept of APIs, the bridge between frontend and backend"
/>
<NavCard
href="#"
title="Backend 2: From Database to Supabase"
description="Master relational database basics and learn to use Supabase, a modern BaaS platform"
/>
<NavCard
href="#"
title="Backend 3: AI-Assisted Interface Code & Documentation"
description="Use AI to assist in generating backend interface code and standard API documentation"
/>
<NavCard
href="#"
title="Backend 4: Git Workflow"
description="Master core operations and collaboration workflows of the Git version control system"
/>
<NavCard
href="#"
title="Backend 5: Zeabur Deployment"
description="Learn to quickly deploy your full-stack applications to the cloud using Zeabur"
/>
<NavCard
href="#"
title="Backend 6: Modern CLI Development Tools"
description="Explore modern CLI tools to enhance command-line development experience"
/>
<NavCard
href="#"
title="Backend 7: Integrating Stripe Payment Systems"
description="Practical: Integrate Stripe payment functionality into your application for monetization"
/>
</NavGrid>
### Assignments
Consolidate your full-stack development skills through practical projects:
<NavGrid>
<NavCard
href="#"
title="Assignment 1: Building Your First Modern App - Full-Stack"
description="Comprehensively apply what you've learned to independently complete a fully functional full-stack application"
/>
<NavCard
href="#"
title="Assignment 2: Modern Frontend Component Library + Trae"
description="Use modern component libraries with Trae IDE to efficiently build complex frontend interfaces"
/>
</NavGrid>
### AI Capabilities Extension
<NavGrid>
<NavCard
href="#"
title="AI 1: Dify Basics & Knowledge Base Integration"
description="Learn to build AI applications using Dify and integrate private knowledge bases"
/>
<NavCard
href="#"
title="AI 2: AI Dictionary Query & Multimodal API Integration"
description="Explore more AI capabilities, integrating vision, voice, and other multimodal APIs"
/>
</NavGrid>
## Who Is This For
- Developers with some programming foundation who want to systematically learn full-stack development
- Learners transitioning from product manager to full-stack engineer
- Junior to intermediate developers who want to master modern development tools and workflows
- Entrepreneurs who want to independently develop complete products
## Prerequisites
- Complete the "Novice & Product Prototype" stage, or have equivalent basic knowledge
- Understand basic HTML/CSS/JavaScript concepts
- Have preliminary knowledge of AI programming tools
Ready to dive deep into full-stack development? Click the left navigation to begin learning!
@@ -0,0 +1,93 @@
# Advanced Development
Welcome to the **Advanced Development** stage! Here, you will build complex cross-platform applications, master WeChat Mini Program development, and challenge yourself with more advanced AI-native application development.
## What You Will Learn
### Core Skills
Deeply master the MCP protocol and Claude Code advanced techniques to improve development efficiency:
<NavGrid>
<NavCard
href="#"
title="Advanced 1: MCP & Claude Code Skills"
description="Master Model Context Protocol (MCP) to extend the capabilities of AI programming tools"
/>
<NavCard
href="#"
title="Advanced 2: Long-Running Tasks"
description="Learn how to make AI coding tools handle long-running complex tasks"
/>
</NavGrid>
### Cross-Platform Development
Build WeChat Mini Programs, Android, and iOS applications to achieve cross-platform coverage:
<NavGrid>
<NavCard
href="#"
title="Advanced 3: Building WeChat Mini Programs"
description="Develop WeChat Mini Programs from scratch, mastering core mini program development workflows"
/>
<NavCard
href="#"
title="Advanced 4: WeChat Mini Programs with Backend"
description="Build complete WeChat Mini Program applications with backend support"
/>
<NavCard
href="#"
title="Advanced 5: Building Android Apps"
description="Use modern cross-platform frameworks to build Android native applications"
/>
<NavCard
href="#"
title="Advanced 6: Building iOS Apps"
description="Develop and publish iOS applications, mastering iOS ecosystem development standards"
/>
</NavGrid>
### Personal Brand
Build your own personal website and tech blog to establish personal influence:
<NavGrid>
<NavCard
href="#"
title="Advanced 7: Building Your Personal Website & Academic Blog"
description="Use modern technology stacks to build high-performance, visually appealing personal blogs"
/>
</NavGrid>
### Advanced AI Capabilities
Explore advanced AI technologies like RAG and LangGraph to build complex AI application workflows:
<NavGrid>
<NavCard
href="#"
title="Advanced AI 1: What is RAG and How It Works"
description="Deeply understand the principles of Retrieval-Augmented Generation (RAG) and its value in AI applications"
/>
<NavCard
href="#"
title="Advanced AI 2: Advanced RAG & Workflow Orchestration - LangGraph"
description="Learn to use LangGraph to orchestrate complex AI workflows and build advanced RAG systems"
/>
</NavGrid>
## Who Is This For
- Advanced developers with full-stack development experience who want to take on more complex applications
- Engineers who want to master cross-platform development technologies
- Explorers who want to deeply understand AI-native application development
- Tech bloggers who want to build their personal technical brand
## Prerequisites
- Complete the "Full-Stack Development" stage, or have full-stack development experience
- Familiar with frontend frameworks (such as React/Vue) and backend development
- Understand basic AI concepts and API usage
Ready to take on advanced development? Click the left navigation to begin learning!