Memory Cache Server
A Model Context Protocol (MCP) server that reduces token consumption by efficiently caching data between language model interactions. Works with any MCP client and any language model that uses tokens.
Installation
- Clone the repository:
git clone git@github.com:ibproduct/ib-mcp-cache-server
cd ib-mcp-cache-server
- Install dependencies:
npm install
- Build the project:
npm run build
- Add to your MCP client settings:
{
  "mcpServers": {
    "memory-cache": {
      "command": "node",
      "args": ["/path/to/ib-mcp-cache-server/build/index.js"]
    }
  }
}
- The server will automatically start when you use your MCP client
Verifying It Works
When the server is running properly, you'll see:
- A message in the terminal: "Memory Cache MCP server running on stdio"
- Improved performance when accessing the same data multiple times
- No action required from you - the caching happens automatically
You can verify the server is running by:
- Opening your MCP client
- Looking for any error messages in the terminal where you started the server
- Performing operations that would benefit from caching (like reading the same file multiple times)
Configuration
The server can be configured through config.json or environment variables:
{
  "maxEntries": 1000,       // Maximum number of items in cache
  "maxMemory": 104857600,   // Maximum memory usage in bytes (100MB)
  "defaultTTL": 3600,       // Default time-to-live in seconds (1 hour)
  "checkInterval": 60000,   // Cleanup interval in milliseconds (1 minute)
  "statsInterval": 30000    // Stats update interval in milliseconds (30 seconds)
}
Configuration Settings Explained
- maxEntries (default: 1000)
- Maximum number of items that can be stored in cache
- Prevents cache from growing indefinitely
- When exceeded, oldest unused items are removed first
- maxMemory (default: 100MB)
- Maximum memory usage in bytes
- Prevents excessive memory consumption
- When exceeded, least recently used items are removed
- defaultTTL (default: 1 hour)
- How long items stay in cache by default
- Items are automatically removed after this time
- Prevents stale data from consuming memory
- checkInterval (default: 1 minute)
- How often the server checks for expired items
- Lower values keep memory usage more accurate
- Higher values reduce CPU usage
- statsInterval (default: 30 seconds)
- How often cache statistics are updated
- Affects accuracy of hit/miss rates
- Helps monitor cache effectiveness
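To make the settings above concrete, here is a minimal sketch of how they might interact inside a cache: maxEntries bounds the entry count with LRU eviction, defaultTTL stamps each entry with an expiry, and a sweep run every checkInterval drops expired entries. The names (SimpleCache, sweep) are illustrative, not the server's actual API.

```javascript
// Illustrative sketch only -- not the server's real implementation.
const config = {
  maxEntries: 1000,
  maxMemory: 104857600, // bytes (100MB); enforcement omitted in this sketch
  defaultTTL: 3600,     // seconds
  checkInterval: 60000, // ms between sweeps
  statsInterval: 30000, // ms between stats updates
};

class SimpleCache {
  constructor(cfg) {
    this.cfg = cfg;
    this.entries = new Map(); // Map iteration order doubles as LRU order
  }

  set(key, value, ttl = this.cfg.defaultTTL) {
    // When maxEntries is exceeded, evict the least recently used entry.
    if (this.entries.size >= this.cfg.maxEntries) {
      const oldest = this.entries.keys().next().value;
      this.entries.delete(oldest);
    }
    this.entries.set(key, { value, expiresAt: Date.now() + ttl * 1000 });
  }

  get(key) {
    const entry = this.entries.get(key);
    if (!entry || entry.expiresAt <= Date.now()) return undefined;
    // Re-insert to mark this key as most recently used.
    this.entries.delete(key);
    this.entries.set(key, entry);
    return entry.value;
  }

  // Would run every checkInterval ms to drop expired entries.
  sweep() {
    const now = Date.now();
    for (const [key, entry] of this.entries) {
      if (entry.expiresAt <= now) this.entries.delete(key);
    }
  }
}

// With maxEntries of 2, inserting a third key evicts the least recently used.
const demo = new SimpleCache({ ...config, maxEntries: 2 });
demo.set('a', 1);
demo.set('b', 2);
demo.set('c', 3); // evicts 'a'
```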
How It Reduces Token Consumption
The memory cache server reduces token consumption by automatically storing data that would otherwise need to be re-sent between you and the language model. You don't need to do anything special - the caching happens automatically when you interact with any language model through your MCP client.
Here are some examples of what gets cached:
1. File Content Caching
When reading a file multiple times:
- First time: Full file content is read and cached
- Subsequent times: Content is retrieved from cache instead of re-reading the file
- Result: Fewer tokens used for repeated file operations
2. Computation Results
When performing calculations or analysis:
- First time: Full computation is performed and results are cached
- Subsequent times: Results are retrieved from cache if the input is the same
- Result: Fewer tokens used for repeated computations
3. Frequently Accessed Data
When the same data is needed multiple times:
- First time: Data is processed and cached
- Subsequent times: Data is retrieved from cache until TTL expires
- Result: Fewer tokens used for accessing the same information
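All three examples follow the same get-or-compute pattern: check the cache first, and only do the expensive work on a miss. A hedged sketch of that pattern (cachedCompute and store are illustrative names, not part of the server):

```javascript
// Illustrative get-or-compute sketch -- not the server's actual API.
const store = new Map();
let computations = 0; // counts how often the expensive work actually runs

function cachedCompute(key, computeFn, ttlMs = 3600000) {
  const hit = store.get(key);
  if (hit && hit.expiresAt > Date.now()) return hit.value; // served from cache
  const value = computeFn(); // first time: perform the full computation
  store.set(key, { value, expiresAt: Date.now() + ttlMs });
  computations += 1;
  return value;
}

// Repeated calls with the same key compute only once until the TTL expires.
const first = cachedCompute('report', () => 'analysis result');
const second = cachedCompute('report', () => 'analysis result');
```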
Automatic Cache Management
The server automatically manages the caching process by:
- Storing data when first encountered
- Serving cached data when available
- Removing old/unused data based on settings
- Tracking effectiveness through statistics
Optimization Tips
1. Set Appropriate TTLs
- Shorter for frequently changing data
- Longer for static content
2. Adjust Memory Limits
- Higher for more caching (more token savings)
- Lower if memory usage is a concern
3. Monitor Cache Stats
- High hit rate = good token savings
- Low hit rate = adjust TTL or limits
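The hit rate mentioned above is simply hits divided by total lookups. A small sketch of the bookkeeping (the stats shape here is an assumption, not the server's actual statistics format):

```javascript
// Illustrative hit/miss bookkeeping -- the real stats format may differ.
const stats = { hits: 0, misses: 0 };

function recordLookup(found) {
  if (found) stats.hits += 1;
  else stats.misses += 1;
}

function hitRate() {
  const total = stats.hits + stats.misses;
  return total === 0 ? 0 : stats.hits / total;
}

// Two hits and one miss gives a hit rate of 2/3.
recordLookup(true);
recordLookup(true);
recordLookup(false);
```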
Environment Variable Configuration
You can override config.json settings using environment variables in your MCP settings:
{
  "mcpServers": {
    "memory-cache": {
      "command": "node",
      "args": ["/path/to/build/index.js"],
      "env": {
        "MAX_ENTRIES": "5000",
        "MAX_MEMORY": "209715200",  // 200MB
        "DEFAULT_TTL": "7200",      // 2 hours
        "CHECK_INTERVAL": "120000", // 2 minutes
        "STATS_INTERVAL": "60000"   // 1 minute
      }
    }
  }
}
You can also specify a custom config file location:
{
  "env": {
    "CONFIG_PATH": "/path/to/your/config.json"
  }
}
The server will:
- Look for config.json in its directory
- Apply any environment variable overrides
- Use default values if neither is specified
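That resolution order (defaults, then config.json, then environment variables) could look roughly like the sketch below; resolveConfig is an illustrative helper, not the server's actual code, and file reading is omitted.

```javascript
// Illustrative override order: defaults < config.json < environment variables.
const DEFAULTS = {
  maxEntries: 1000,
  maxMemory: 104857600,
  defaultTTL: 3600,
  checkInterval: 60000,
  statsInterval: 30000,
};

function resolveConfig(fileConfig = {}, env = {}) {
  const fromEnv = {};
  if (env.MAX_ENTRIES) fromEnv.maxEntries = Number(env.MAX_ENTRIES);
  if (env.MAX_MEMORY) fromEnv.maxMemory = Number(env.MAX_MEMORY);
  if (env.DEFAULT_TTL) fromEnv.defaultTTL = Number(env.DEFAULT_TTL);
  if (env.CHECK_INTERVAL) fromEnv.checkInterval = Number(env.CHECK_INTERVAL);
  if (env.STATS_INTERVAL) fromEnv.statsInterval = Number(env.STATS_INTERVAL);
  // Later spreads win: env overrides config.json, which overrides defaults.
  return { ...DEFAULTS, ...fileConfig, ...fromEnv };
}

// config.json raises maxEntries; an env var overrides the TTL; the rest
// fall back to defaults.
const resolved = resolveConfig({ maxEntries: 2000 }, { DEFAULT_TTL: '7200' });
```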
Testing the Cache in Practice
To see the cache in action, try these scenarios:
- File Reading Test
- Read and analyze a large file
- Ask the same question about the file again
- The second response should be faster as the file content is cached
- Data Analysis Test
- Perform analysis on some data
- Request the same analysis again
- The second analysis should use cached results
- Project Navigation Test
- Explore a project's structure
- Query the same files/directories again
- Directory listings and file contents will be served from cache
The cache is working when you notice:
- Faster responses for repeated operations
- Consistent answers about unchanged content
- No need to re-read files that haven't changed