MCP for dummies

You have probably heard of MCP, which stands for “Model Context Protocol”. But what does it actually mean?

๐—ง๐—Ÿ;๐——๐—ฅ: MCP lets ChatGPT use your tools, not just OpenAIโ€™s. Itโ€™s not just about giving LLMs more information. Itโ€™s about giving them capabilities that make it work on your domain, data, and workflows.

After reading the docs and watching tutorials, it still wasn’t clear to me.

Everything clicked when I built an MCP server myself. This post is my attempt to spare you that confusion.

Let’s start from the basics. LLMs are limited. By themselves, they don’t know what’s happening in the world right now. They don’t have access to your data. And they can’t act on your behalf.

They’re trained on data up to a fixed point in time. So if you ask “Who won the IPL this year?”, they shouldn’t know. But they do. How?

Tools.

When ChatGPT realises it can’t answer your question directly, it triggers its search tool to fetch the most relevant pages on the internet, reads them, and answers based on what it finds. The tool fills a gap in its abilities.

OpenAI has added many such tools inside ChatGPT. However, the tools we can use in ChatGPT are limited to the ones OpenAI adds. MCP changes that. Since ChatGPT does not fully support MCP yet, we will use Claude from here on.

Let’s say you want Claude to check your expenses every morning and send you a summary if your monthly spending has crossed a threshold.

How can you do that?

1. Define your own tools

Here, a “tool” is just a small piece of software, a program or script, that performs a specific task for a given input. It might run a specific type of query on your database (e.g. fetch, filter and sort data) or carry out an action like sending a message or generating a report.
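To make this concrete, here is a minimal sketch of two tools for the expenses example: one that queries data and one that prepares the summary message. Everything in it is an assumption for illustration: the `Expense` record, the in-memory `EXPENSES` list standing in for a real database, and the function names `monthly_spend` and `spending_summary`.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Expense:
    day: date
    amount: float
    category: str


# Hypothetical in-memory data standing in for your real expense database.
EXPENSES = [
    Expense(date(2025, 6, 2), 1200.0, "rent"),
    Expense(date(2025, 6, 9), 340.5, "groceries"),
    Expense(date(2025, 6, 15), 89.0, "transport"),
    Expense(date(2025, 5, 28), 410.0, "groceries"),
]


def monthly_spend(year: int, month: int) -> float:
    """Query-style tool: sum all expenses recorded in the given month."""
    return sum(
        e.amount for e in EXPENSES
        if e.day.year == year and e.day.month == month
    )


def spending_summary(year: int, month: int, threshold: float) -> str:
    """Action-style tool: build the alert text, or "" if under the threshold."""
    total = monthly_spend(year, month)
    if total <= threshold:
        return ""  # nothing to report
    return f"Spent {total:.2f} in {year}-{month:02d}, over the {threshold:.2f} limit."
```

Notice that nothing here knows about LLMs yet. A tool is ordinary code with clear inputs and outputs; making it visible to Claude is a separate step.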

2. Wrap these tools inside an MCP server

MCP is simply a protocol, a common language that specifies how tools are described and exposed so that Claude can talk to them. An MCP server is just a backend that follows this protocol and exposes your tools like an API. It acts as the bridge between your tools and Claude.
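To show roughly what “exposing tools like an API” means, here is a toy dispatcher that answers the two tool-related requests MCP defines on top of JSON-RPC 2.0: `tools/list` and `tools/call`. It is a simplified sketch, not a working MCP server: it skips the initialization handshake and the transport layer, and in practice you would most likely use an official SDK instead. The `monthly_spend` tool and its data are hypothetical.

```python
import json


def monthly_spend(month: str) -> float:
    """Hypothetical tool: a stand-in for a real query against your data."""
    data = {"2025-05": 410.0, "2025-06": 1629.5}
    return data.get(month, 0.0)


# Registry: each tool is published with a name, a description,
# and a JSON Schema describing the inputs it accepts.
TOOLS = {
    "monthly_spend": {
        "fn": monthly_spend,
        "description": "Total spending for a month given as YYYY-MM.",
        "inputSchema": {
            "type": "object",
            "properties": {"month": {"type": "string"}},
            "required": ["month"],
        },
    }
}


def _error(request_id, code: int, message: str) -> dict:
    return {"jsonrpc": "2.0", "id": request_id,
            "error": {"code": code, "message": message}}


def handle(request: dict) -> dict:
    """Answer one JSON-RPC 2.0 request in the shape MCP uses for tools."""
    request_id = request.get("id")
    method = request.get("method")

    if method == "tools/list":
        # Tell the client which tools exist and what inputs they take.
        result = {"tools": [
            {"name": name, "description": t["description"],
             "inputSchema": t["inputSchema"]}
            for name, t in TOOLS.items()
        ]}
    elif method == "tools/call":
        params = request.get("params", {})
        tool = TOOLS.get(params.get("name"))
        if tool is None:
            return _error(request_id, -32602, "Unknown tool")
        value = tool["fn"](**params.get("arguments", {}))
        # Results travel back as a list of content items.
        result = {"content": [{"type": "text", "text": str(value)}],
                  "isError": False}
    else:
        return _error(request_id, -32601, "Method not found")

    return {"jsonrpc": "2.0", "id": request_id, "result": result}


if __name__ == "__main__":
    call = {"jsonrpc": "2.0", "id": 2, "method": "tools/call",
            "params": {"name": "monthly_spend",
                       "arguments": {"month": "2025-06"}}}
    print(json.dumps(handle(call), indent=2))
```

The important part is the registry: the description and input schema are what the model reads when deciding whether a tool fits the question, so they are worth writing carefully.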

3. Connect Claude to this server

Once connected, the LLM can decide “This looks like a job for one of your tools”, pick your tool over the pre-built ones, pass in the right inputs, get the result, and respond.
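The decide, call, respond loop can be sketched in a few lines. Big caveat: `fake_model_pick_tool` is a placeholder I made up. A real model chooses a tool by reasoning over the tool descriptions, not by matching keywords, and the tool name, arguments and data below are hypothetical too.

```python
from typing import Callable, Optional


def monthly_spend(month: str) -> float:
    """Hypothetical tool exposed by your server."""
    return {"2025-06": 1629.5}.get(month, 0.0)


TOOLS: dict[str, Callable[..., float]] = {"monthly_spend": monthly_spend}


def fake_model_pick_tool(question: str) -> Optional[dict]:
    """Stand-in for the LLM's decision (assumption: keyword match only)."""
    if "spend" in question.lower():
        return {"name": "monthly_spend", "arguments": {"month": "2025-06"}}
    return None


def answer(question: str) -> str:
    # 1. Decide whether one of the tools fits the question.
    call = fake_model_pick_tool(question)
    if call is None:
        return "I can answer this without a tool."
    # 2. Pass in the inputs and get the result back.
    result = TOOLS[call["name"]](**call["arguments"])
    # 3. Turn the raw result into a reply.
    return f"You spent {result:.2f} in {call['arguments']['month']}."
```

Calling `answer("How much did I spend in June?")` goes through all three steps, while a question that matches no tool falls through to a plain reply.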

For example, if your tool gives Claude the ability to query your internal database, the rest of your team can simply ask for the analysis they want. Claude will figure out the right tool to invoke, get back the data, write and run code to analyse it, and generate plots or even build entire webapps for them to use.

No dev time. No long waiting times. No context switching between interfaces.

A tool could even be as advanced as an agent that calls other tools, connected to other MCP servers, letting you chain together multiple systems and build powerful workflows.