Organized Your Ideas With Claude Code AI Agent and Obsidian [Guide & Setup & Agent Skills]
By Tony Huang
Summary
## Key takeaways
- **Voice Memos Capture Relaxed Ideas**: Sitting in front of screen does not bring great ideas. Voice memo is a great alternative when brain is relaxed like walking or showering, to record project ideas, frustrations, reflections. [00:31], [00:43]
- **Three Pain Points of Voice Memos**: Transcripts cannot be auto-imported to Obsidian, content is raw and messy, no clear title or tag, making it a nightmare to manage in file system. [01:05], [01:15]
- **Third-Party Solutions' Defects**: Third party solutions have data security risks by sending voice memos to AI providers, high extra subscription costs like $10/month, and lack customizability to fit your file system. [01:28], [01:42]
- **Claude Code Agent Skills Download**: Download agent skills package from community, unpack to get voice memo agent skills folder including Claude instructions, note properties for tagging, and .env for OpenAI API key. [02:57], [03:31]
- **AI Transcribes and Refines Notes**: AI agent processes voice memos from last month, transcribes to raw texts saving which are processed in database, then refines into Obsidian notes with title, date, summary, and tags. [08:46], [09:42]
- **Obsidian Management with Tags**: Refined notes include note itself and properties; use tags from tag list to filter and manage massive voice memo files by topics in Obsidian. [07:00], [11:26]
Topics Covered
- Screens Block Great Ideas
- Voice Memos Beat Third-Party Tools
- AI Agents Download Skills Instantly
- AI Auto-Tags Raw Transcripts
Full Transcript
You can simply ask your AI agent to process all your voice memos. It will automatically transcribe your memos into text and refine them into structured Obsidian notes with titles, summaries, and even tags attached.
Hi, Tony here. I build AI agents for my clients, but in this channel I'd like to explore how we can use AI agents to boost our daily life. If you, like me, love writing, thinking, and creating, you've probably felt this too: sitting in front of a screen does not bring you great ideas. I find I'm more likely to have a great idea when my brain is relaxed, for example when I'm walking or taking a shower. But when inspiration comes, there's no keyboard for me to type on. A voice memo, on the other hand, is a great alternative. You can start speaking with the press of a button. You can record anything: new project ideas, frustrations, reflections, or your opinions on any topic. These recordings are even more valuable now.
You can use them as background information to have a meaningful discussion with your AI. However, managing them is kind of a nightmare. The transcripts cannot be auto-imported into my Obsidian note system, and the content is raw and messy. It's not friendly for reading and review. You don't have a clear title or tag, so it is very difficult to manage in a file system.
Yes, there are some solutions on the market to help you complete the record, transcript, and note pipeline. But there are some downsides. The first is data security. Those third-party solutions will send your voice memos to an AI provider. Why not use the AI provider's service directly? It's all available on the market. The second is price. I believe all of you already have AI subscriptions. They already cost me a lot of money, and I don't want to add another subscription that costs me something like $10 each month for such a simple thing. The third reason is definitely customizability. I would like my transcripts to fit my file management system, not the other way around. So I wrote my own AI application to automate the task six months ago. I'd like to share it, but it's not user-friendly for non-coders.
Then recently Claude Code had a major upgrade: AI agent skills. It basically allows your AI agent to learn my AI agent skills by simply downloading them. Sounds like what happened in The Matrix, doesn't it? So you don't need to learn Python. Your AI agent will do it for you. It sounds like the beginning of a new era. You should definitely try it out yourself.
Okay, this is an example of an Obsidian vault. What we need to do is prepare two new folders. One is for the raw transcripts of the voice memos, and the other is for the refined Obsidian notes made from our raw transcripts.
All right. Once it is ready, we can use Visual Studio to open it up. Okay.
The next step is to download the agent skill package from our community.
You can simply find the post and click the zip file.
You can unpack the zip file and we will have a new folder called voice memo agent skills. It includes .claude, which is the agent skills folder, and note properties, which actually tells your agent how to tag your voice memos.
All right, the last one is the env file. This is an example. This is the place where you save your OpenAI API key. All right, let's rename it as .env and save your OpenAI key here. All right, once you've done that, the basic setup is ready.
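As a rough illustration of what "saving your key in .env" means for a script, here is a minimal Python sketch that reads simple KEY=VALUE lines from such a file. The variable name `OPENAI_API_KEY` is the one the OpenAI SDK reads from the environment by default; whether the downloaded package uses exactly that name is an assumption, and `load_env` is a hypothetical helper, not part of the package.

```python
from pathlib import Path


def load_env(path: Path) -> dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values: dict[str, str] = {}
    for line in path.read_text(encoding="utf-8").splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values
```

A script could then call `load_env(Path(".env")).get("OPENAI_API_KEY")` and stop early with a clear message if the key is missing.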
Your Obsidian folder structure in Visual Studio should look like this.
Okay, now we have all the files and folders prepared. Let's dig a little bit deeper. The .claude folder includes all the prompt instructions for your AI agent. The first one is the prep agent skill command. It is the command that asks your Claude AI agent to examine your computer environment and check whether it is ready. If not, it will help you fix it.
All right, let's drill down to skills. We've got two skills. The first one is voice memo process and the other is refine memo for Obsidian. If you open voice memo process, you'll find it includes a skill Markdown file and a script. The skill Markdown file is the instruction, and the script is the tool your AI agent uses to process all your voice memos. The second one, by comparison, does not have a script. It only has a skill Markdown file that instructs your AI agent how to get things done.
Okay, now we have a basic understanding of the structure. Let's have a look at the configurations you need to set up before your agent starts running. You may have a close look at it, but the one configuration file you need to care about is this config YAML file. It actually controls how the program works.
To find it, it's really simple. Just press Command and click. I asked my agent to prepare detailed instructions, which are the green text here, so let's have a look. You will find the transcription model. In our case it is gpt-4o-transcribe. You need to set your native language so the model will process it more accurately, and you can leave the temperature as it is.
All right, let's have a look at the source. The voice memo directory is where your voice memos are saved. Since I use iPhone Voice Memos and a MacBook, this is the default directory. The next one you must have prepared is the output directory, where your transcripts will be saved. In Visual Studio it's really simple: right-click the target folder, click Copy Path, and save the absolute file path to the configuration.
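To make the transcription settings more concrete, here is a hedged Python sketch of how a model name, a language, and a temperature might be passed to OpenAI's audio transcription endpoint. The field names in `TranscriptionSettings`, the default temperature of 0.0, and the `transcribe` helper are assumptions for illustration; the package's actual script and config keys are not shown in the video. The `client` argument is expected to be an `openai.OpenAI()` instance, which picks up `OPENAI_API_KEY` from the environment.

```python
from dataclasses import dataclass
from pathlib import Path


@dataclass
class TranscriptionSettings:
    # Hypothetical field names mirroring the settings described above.
    model: str = "gpt-4o-transcribe"
    language: str = "en"       # ISO-639-1 code of the language spoken in the memos
    temperature: float = 0.0   # assumed default; the video leaves it unchanged


def transcribe(client, audio_path: Path, settings: TranscriptionSettings) -> str:
    """Send one audio file to the transcription endpoint and return its text."""
    with audio_path.open("rb") as audio_file:
        result = client.audio.transcriptions.create(
            model=settings.model,
            file=audio_file,
            language=settings.language,
            temperature=settings.temperature,
        )
    return result.text
```

Passing the client in as a parameter keeps the function easy to try with a stand-in object before spending API credits.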
The remaining ones are all basic setup, including a database and a safeguard. Let's have a look at refine memo for Obsidian. There are a couple of file paths for you to update. This is the skill that instructs Claude Code to turn your transcripts into Obsidian-style notes. There are three file paths you need to update: where your transcripts were saved, where you want your refined transcripts to be saved, and the last one is your tag list. Since we would like to manage our massive number of voice memo files, it's better to have your AI do the tagging for you, so you will be able to filter for the topics you care about. You can have a look at the tag list file. This is a basic default file. You can change it to anything you want. Just ensure the format is similar, and your AI agent will learn from it and help you manage your voice memo notes.
Okay. Now we have everything set up. We can run Claude Code. Let's open up the terminal, drag it to your right-hand side, and simply type claude.
All right. If this is your first time starting Claude, you will be asked to log in with your Claude Code subscription or provide an API key.
Okay. If you don't know how to install Claude Code, simply click the link down below in the description. Once you have Claude Code ready, simply ask what skills you have. Let's check it out and see if the skills are successfully implemented.
Before we run the skills, especially for any of you who don't have coding experience, just run this prep command. Claude Code will automatically check your computer environment to ensure everything is ready. It won't take long. When Claude Code asks for your permission, just press yes. It will check all the relevant code and your system settings. Once it is done, it will give you a report.
Okay, everything is ready. We can check Claude's report. Skill one, voice memo process, passed the check: the API is ready, and the drive around the root [sic] is ready. For the second skill, the status is not available because it does not have any script. All right, good to go.
What I suggest you do is type /clear to clean the memory. "Okay, help me process the last month's voice memos." What the agent will do is look into your voice memos and pick out the ones that were created in the last month.
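The "last month" selection can be sketched in a few lines of Python. This is only an illustration under stated assumptions: memos are `.m4a` files in one folder, the file's modification time stands in for its creation time, and "last month" means the last 30 days. `recent_memos` is a hypothetical name, not the package's function.

```python
from datetime import datetime, timedelta
from pathlib import Path
from typing import Optional


def recent_memos(folder: Path, days: int = 30,
                 now: Optional[datetime] = None) -> list[Path]:
    """Return .m4a files in `folder` modified within the last `days` days, oldest first."""
    now = now or datetime.now()
    cutoff = (now - timedelta(days=days)).timestamp()
    files = [p for p in folder.glob("*.m4a") if p.stat().st_mtime >= cutoff]
    return sorted(files, key=lambda p: p.stat().st_mtime)
```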
All right, it seems that our agent found five of them. If you look deeper, you will find that this data was recorded in the database. So the next time you ask your AI agent to check the voice memos, it will know which have been processed and which have not. It makes the system work more smoothly.
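The idea of remembering which memos were already handled can be sketched with a tiny ledger. The video only says "database"; using SQLite, the table name `processed`, and the helper names below are all assumptions for illustration.

```python
import sqlite3


def open_ledger(db_path: str) -> sqlite3.Connection:
    """Open (or create) the ledger of memos that have already been handled."""
    conn = sqlite3.connect(db_path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS processed ("
        "filename TEXT PRIMARY KEY, status TEXT NOT NULL)"
    )
    return conn


def mark(conn: sqlite3.Connection, filename: str, status: str) -> None:
    """Record the outcome for one memo, overwriting any earlier entry."""
    conn.execute(
        "INSERT OR REPLACE INTO processed (filename, status) VALUES (?, ?)",
        (filename, status),
    )
    conn.commit()


def unprocessed(conn: sqlite3.Connection, filenames: list[str]) -> list[str]:
    """Return the filenames that have no ledger entry yet, in their original order."""
    done = {row[0] for row in conn.execute("SELECT filename FROM processed")}
    return [name for name in filenames if name not in done]
```

In this sketch a failed memo also counts as handled, so an empty recording is not retried on every run; a real tool might choose to retry failures instead.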
All right, we simply tell Claude Code to proceed.
It takes a couple of minutes to complete. If we have a close look at the result, we have four successes and one failure, because that one has no content in it. Okay, things are ready. We can check it out. So it's in our Obsidian note folder. It will ask you: would you like to refine those raw texts into Obsidian-style notes?
Well, you can simply say yes, or you can do it the other way around. As I said, you can choose to clear the memory and ask your AI agent to refine your Obsidian notes, or you can compact the memory and tell the agent, "Refine my voice memos into MD notes."
Okay, we have four in hand. Which one would you like to process?
Okay, now you can ask your agent to process all of them, or you can select a specific note to process. Say we've got a specific target here. You can simply type refine and @ the file to tell Claude Code to do the job. It won't take long.
Okay, the job is done. Since my original voice memo is in Chinese, it gave me a Chinese version. I'll tell my AI agent to refine it into an English version.
Done.
Sometimes you are not satisfied with your voice memo note. You can always tell your AI agent to update the result or do it again. No complaints.
The result is pretty neat. We got a title. We got a date. We got a summary. We got tags. It extracted the workflow we discussed in the transcript. Well, perfect for further review. Let's have a look at the result in our Obsidian.
Well, it includes the note itself and the note properties.
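For readers who have not seen Obsidian properties before: they are stored as a YAML block between `---` lines at the top of the Markdown file. The Python sketch below shows roughly what a refined note with title, date, summary, and tags could look like on disk. The `render_note` helper and the rule of dropping tags that are not in the tag list are assumptions for illustration; in the video this step is done by the agent following the skill's instructions, not by a script.

```python
import json
from datetime import date


def render_note(title: str, day: date, summary: str, tags: list[str],
                allowed_tags: set[str], body: str) -> str:
    """Build a Markdown note with YAML properties; tags outside the tag list are dropped."""
    kept = [t for t in tags if t in allowed_tags]
    lines = [
        "---",
        # json.dumps yields a double-quoted string, which is also valid YAML.
        f"title: {json.dumps(title, ensure_ascii=False)}",
        f"date: {day.isoformat()}",
        f"summary: {json.dumps(summary, ensure_ascii=False)}",
    ]
    if kept:
        lines.append("tags:")
        lines += [f"  - {t}" for t in kept]
    else:
        lines.append("tags: []")
    lines += ["---", "", body]
    return "\n".join(lines) + "\n"
```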
If you know how to use Bases, you can simply manage all of your voice notes with tags. You can add time and tags to manage all your voice memos and notes.
Thanks for watching. I'll see you next time. Bye-bye.