vmtouch – the Virtual Memory Toucher
Portable file system cache diagnostics and control
vmtouch is a tool for learning about and controlling the file system cache of unix and unix-like systems. It is BSD licensed so you can basically do whatever you want with it.
Quick install guide:
$ git clone https://github.com/hoytech/vmtouch.git
$ cd vmtouch
$ make
$ sudo make install
What is it good for?
- Discovering which files your OS is caching
- Telling the OS to cache or evict certain files or regions of files
- Locking files into memory so the OS won’t evict them
- Preserving virtual memory profile when failing over servers
- Keeping a "hot-standby" file-server
- Plotting filesystem cache usage over time
- Maintaining "soft quotas" of cache usage
- Speeding up batch/cron jobs
- And much more…
How much of the /bin/ directory is currently in cache?
$ vmtouch /bin/
Files: 92
Directories: 1
Resident Pages: 348/1307 1M/5M 26.6%
Elapsed: 0.003426 seconds
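For uses such as plotting cache usage over time, the summary can be reduced to plain numbers. The following is a minimal sketch, assuming the "Resident Pages:" line always has the layout shown above; the helper names `resident_summary` and `log_sample` are hypothetical and not part of vmtouch.

```shell
#!/bin/sh
# resident_summary: read vmtouch output on stdin and print
# "resident_pages total_pages percent" taken from the "Resident Pages:" line.
# Assumes the line layout shown in the example output (an assumption, not a documented format).
resident_summary() {
    awk '/Resident Pages:/ {
        split($3, pages, "/")   # third field, e.g. "348/1307"
        pct = $NF               # last field, e.g. "26.6%"
        sub(/%$/, "", pct)      # drop the trailing percent sign
        print pages[1], pages[2], pct
    }'
}

# log_sample PATH: print one timestamped sample for PATH.
# Requires vmtouch to be installed; nothing runs until the function is called.
log_sample() {
    printf '%s %s\n' "$(date +%s)" "$(vmtouch "$1" | resident_summary)"
}
```

Calling `log_sample /bin/ >> cache-usage.log` from a cron job would append one line per run, which a plotting tool can then read.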
How much of big-dataset.txt is currently in memory?
$ vmtouch -v big-dataset.txt
big-dataset.txt
[                                                            ] 0/42116
Files: 1
Directories: 0
Resident Pages: 0/42116 0/164M 0%
Elapsed: 0.005182 seconds
None of it. Now let’s bring part of it into memory with tail:
$ tail -n 10000 big-dataset.txt > /dev/null
Now how much?
$ vmtouch -v big-dataset.txt
big-dataset.txt
[                                                    oOOOOOOO] 4950/42116
Files: 1
Directories: 0
Resident Pages: 4950/42116 19M/164M 11.8%
Elapsed: 0.006706 seconds
vmtouch tells us that 4950 pages at the end of the file are now resident in memory.
Let’s touch the rest of big-dataset.txt and bring it into memory (pressing enter a few times to illustrate the animated progress bar you will see on your terminal):
$ vmtouch -vt big-dataset.txt
big-dataset.txt
[OOo                                                 oOOOOOOO] 6887/42116
[OOOOOOOOo                                           oOOOOOOO] 10631/42116
[OOOOOOOOOOOOOOo                                     oOOOOOOO] 15351/42116
[OOOOOOOOOOOOOOOOOOOOOo                              oOOOOOOO] 19719/42116
[OOOOOOOOOOOOOOOOOOOOOOOOOOOo                        oOOOOOOO] 24183/42116
[OOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOo                  oOOOOOOO] 28615/42116
[OOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOo              oOOOOOOO] 31415/42116
[OOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOo      oOOOOOOO] 36775/42116
[OOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOo  oOOOOOOO] 39431/42116
[OOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOO] 42116/42116
Files: 1
Directories: 0
Touched Pages: 42116 (164M)
Elapsed: 12.107 seconds
We have 3 big datasets, a.txt, b.txt, and c.txt, but only 2 of them will fit in memory at once. If we have a.txt and b.txt in memory but would now like to work with b.txt and c.txt, we could just start loading c.txt, but then our system would evict pages from both a.txt (which is what we want) and b.txt (which is not).
So let’s give the system a hint and evict a.txt from memory, making room for c.txt:
$ vmtouch -ve a.txt
Evicting a.txt
Files: 1
Directories: 0
Evicted Pages: 42116 (164M)
Elapsed: 0.076824 seconds
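The evict-then-load step can be wrapped in a small helper. This is only a sketch: the function name `swap_working_set` and the `VMTOUCH` override variable are hypothetical, and only the `-e` and `-t` options shown in the examples above are used.

```shell
#!/bin/sh
# VMTOUCH may be overridden, e.g. to point at a different binary or a dry-run stub.
VMTOUCH="${VMTOUCH:-vmtouch}"

# swap_working_set OLD NEW: evict OLD from the page cache, then touch NEW into it.
swap_working_set() {
    # Evict first, so that loading NEW does not push out other files we still need.
    "$VMTOUCH" -e "$1" || return 1
    "$VMTOUCH" -t "$2"
}

# Usage: swap_working_set a.txt c.txt
```

Evicting before touching matters here: if c.txt were loaded first, the kernel would choose which pages to drop, and some of them could belong to b.txt.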
Daemonise and lock all files in a directory into physical memory:
$ vmtouch -dl /var/www/htdocs/critical/
What other people are saying
People have found lots of uses for vmtouch over the years. Here are a few links in no particular order:
- Admin magazine: Performance Tuning Dojo: Tune-Up
- Techniques for Warming Up a MongoDB Secondary
- Linux Memory Usage
- What a C programmer should know about memory
- Playlists at Spotify – Using Cassandra to store version controlled objects (slide 32)
- Understanding and optimizing Memory utilization
- Tune Up Paging with vmtouch
- Linux Cached Memory
- Of how much of a file is in RAM
- Manipulating the kernel’s page cache with vmtouch
- Memory management in Linux kernel (slide 16)
- Supercomputing on the cheap with Parallella
- System Design and Big Data, chapter 6
- Lucene @ Yelp (slide 16)
- tuxdiary: vmtouch: portable file cache analyzer
- Linux kernel mailing list: zcache: Support zero-filled pages more efficiently
- comp.db.sqlite.general: Strange eviction from Linux page cache
- Emacs speed up 1000%
- Jolla Review: Some Rough Edges, But This Linux Smartphone Shows Promise (vmtouch deployed on maemo phones?)
- ceph-users: Ceph SSD array with Intel DC S3500’s
- proxmox forums: CPU Performance Degradtion
- Argonne National Laboratory’s Advanced Photon Source
- Elastic Search: Dealing with OS page cache evictions?
- Data-center deploy using torrent and mlock()
- Making best use of 512mb Pi with tmpfs
- redis-db: Issue with Redis replication while transferring rdb file from master to slave
- mongodb-user: Oplog Memory Consumption
- CentOS bugtracker: oom killer kills process rather than freeing cache
Discussion about Instagram’s usage of vmtouch:
- What Powers Instagram: Hundreds of Instances, Dozens of Technologies
- Instagram Architecture: 14 Million Users, Terabytes Of Photos, 100s Of Instances, Dozens Of Technologies
- The Instagram Architecture Facebook Bought For A Cool Billion Dollars
- parse_vmtouch.py (script used by Instagram)
Stack Overflow and friends
- Does the Linux filesystem cache files efficiently?
- Postgresql doesn’t use memory for caching
- MongoDB, NUMA hardware, page faults
- Know programs in cache
- Is it possible to list the files that are cached?
- Tell the linux kernel to put a file in the disk cache?
- Securely wipe an entire Linux server with itself
- Caching/preloading files on Linux into RAM
- Why drop caches in Linux?
- Clear / Flush cached memory
- limit filesystem cache size for specific files under linux
- Memory mapping files for a blazing fast webserver on Linux
- Performance difference between ramfs and tmpfs
- How do I lock a growing directory in memory?
- How do I vmtouch a directory (not the files it contains)? (good question, I don’t know of a userspace way to do this)
- MySQL queries are 10 to 100 times slower after OS reboot
- How can one examine what files are in Linux’s page cache?
There are also lots of mentions on Twitter using the #vmtouch hashtag.
Have another link? Please let me know!
vmtouch is copyright (c) 2009-2016 Doug Hoyte and contributors.
Contributors are listed in CHANGES.