Cache memory (CM) is a special high-speed storage mechanism. It can be either a reserved part of main memory or an independent high-speed storage device. Two types of caching are commonly used in personal computers: memory caching (MC) and disk caching. A memory cache, sometimes called a cache store or RAM cache, is a portion of memory made of high-speed static RAM (SRAM) instead of the slower and cheaper dynamic RAM (DRAM) used for main memory. Memory caching is effective because most programs access the same data or instructions over and over. By keeping as much of this information as possible in SRAM, the computer avoids accessing the slower DRAM. Some memory caches are built into the architecture of microprocessors.
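
The effect of repeated accesses can be illustrated with a small simulation. This is only a sketch: the text does not say how a memory cache chooses what to keep, so the least-recently-used (LRU) replacement rule, the function name simulate_cache and the access pattern are all assumptions made for illustration.

```python
from collections import OrderedDict


def simulate_cache(accesses, capacity):
    """Replay a sequence of addresses through a small cache.

    Returns (hits, misses). The LRU eviction rule is an assumption,
    not something stated in the text.
    """
    cache = OrderedDict()  # cached addresses, least recently used first
    hits = misses = 0
    for address in accesses:
        if address in cache:
            hits += 1
            cache.move_to_end(address)  # mark as most recently used
        else:
            misses += 1  # this access would go to the slower DRAM
            cache[address] = True
            if len(cache) > capacity:
                cache.popitem(last=False)  # evict the least recently used
    return hits, misses


# A hypothetical program loop that touches the same four addresses 25 times.
looping_accesses = [0, 1, 2, 3] * 25
hits, misses = simulate_cache(looping_accesses, capacity=4)
print(f"hits={hits} misses={misses}")
```

With a cache large enough to hold the four addresses, only the first pass misses and every later access is served from the cache. If the cache is too small for the loop, the same rule keeps evicting data just before it is needed again, so the benefit depends on the working set fitting in the cache.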

Disk caching works under the same principle as memory caching, but instead of using high-speed SRAM, a disk cache uses conventional main memory. The most recently accessed data from the disk is stored in a memory buffer. When a program needs to access data from the disk, it first checks the disk cache to see if the data is there. Disk caching can dramatically improve the performance of applications, because accessing a byte of data in RAM can be thousands of times faster than accessing a byte on a hard disk.
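
The check-the-buffer-first behaviour described above might be sketched as follows. The class name DiskCache, the block-number interface and the stand-in slow_disk_read function are hypothetical, and the buffer here never evicts anything, which a real disk cache would have to do.

```python
class DiskCache:
    """Keeps recently read disk blocks in a main-memory buffer (illustrative only)."""

    def __init__(self, read_block_from_disk):
        # read_block_from_disk is a hypothetical function: block number -> bytes
        self._read_block_from_disk = read_block_from_disk
        self._buffer = {}      # block number -> data held in RAM
        self.disk_reads = 0    # how often the slow disk was actually used

    def read(self, block_number):
        # Check the memory buffer first; go to the disk only on a miss.
        if block_number not in self._buffer:
            self.disk_reads += 1
            self._buffer[block_number] = self._read_block_from_disk(block_number)
        return self._buffer[block_number]


def slow_disk_read(block_number):
    """Stand-in for a real disk access; returns fake block contents."""
    return f"block-{block_number}".encode()


disk_cache = DiskCache(slow_disk_read)
for block in [7, 7, 7, 8, 7]:
    disk_cache.read(block)
print(disk_cache.disk_reads)
```

Five reads are requested, but the disk is touched only once per distinct block; the repeats are served from RAM.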

For example, the Internet connection is usually the slowest link in a computer system, so the browser (Internet Explorer, Netscape, Opera, etc.) uses the hard disk to store HTML pages, putting them into a special folder on the disk.
The first time you ask for an HTML page, the browser downloads and renders it, and also stores a copy on your disk. The next time you request this page, your browser checks whether the date of the file on the Internet is newer than the date of the cached copy.
If the date is the same, your browser uses the copy on your hard disk instead of downloading it from the Internet. In this case, the smaller but faster memory system is your hard disk and the larger and slower one is the Internet.
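
The date check can be sketched in a few lines. Everything named here (FakeServer, Browser, last_modified, download) is a hypothetical stand-in for illustration and does not correspond to any real browser's or server's API; dates are simplified to plain integers.

```python
from dataclasses import dataclass


@dataclass
class CachedPage:
    html: str
    last_modified: int  # simplified date of the copy stored on disk


class FakeServer:
    """Hypothetical stand-in for the Internet: url -> (html, last-modified date)."""

    def __init__(self, pages):
        self.pages = pages

    def last_modified(self, url):
        return self.pages[url][1]

    def download(self, url):
        return self.pages[url][0]


class Browser:
    def __init__(self, server):
        self.server = server
        self.disk_cache = {}  # url -> CachedPage, standing in for the cache folder
        self.downloads = 0

    def get(self, url):
        remote_date = self.server.last_modified(url)
        cached = self.disk_cache.get(url)
        # Reuse the disk copy unless the file on the server is newer.
        if cached is not None and remote_date <= cached.last_modified:
            return cached.html
        self.downloads += 1
        html = self.server.download(url)
        self.disk_cache[url] = CachedPage(html, remote_date)
        return html


server = FakeServer({"/index.html": ("<p>v1</p>", 100)})
browser = Browser(server)
browser.get("/index.html")   # first request: downloaded and cached
browser.get("/index.html")   # same date: served from the disk cache
server.pages["/index.html"] = ("<p>v2</p>", 200)
latest = browser.get("/index.html")  # newer date on the server: downloaded again
print(browser.downloads, latest)
```

Note that even the cached case still asks the server for the date, which is a much smaller transfer than the full page.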

Other caches include:
L2 cache: a special memory bank on the motherboard that is small but very fast, with access about two times faster than main memory. This is called a level 2 cache or an L2 cache.
L1 cache: an even smaller but faster memory built directly into the microprocessor’s chip, accessed at the speed of the microprocessor rather than the speed of the memory bus. This is called a level 1 cache or an L1 cache.
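
To see how the two levels combine, the average time per memory access can be estimated from each level's access time and hit rate. The function name and all the numbers below are made-up illustrative values, not measurements of any real processor.

```python
def average_access_time(l1_time, l1_hit_rate, l2_time, l2_hit_rate, memory_time):
    """Estimate the average access time for an L1 + L2 + main memory hierarchy.

    Every access pays the L1 time; L1 misses also pay the L2 time;
    L2 misses additionally pay the main memory time.
    """
    l1_miss_penalty = l2_time + (1 - l2_hit_rate) * memory_time
    return l1_time + (1 - l1_hit_rate) * l1_miss_penalty


# Hypothetical figures in nanoseconds, chosen only to make the arithmetic easy.
with_caches = average_access_time(
    l1_time=1, l1_hit_rate=0.9, l2_time=5, l2_hit_rate=0.8, memory_time=50
)
print(f"average with caches: about {with_caches:.2f} ns, main memory alone: 50 ns")
```

Under these assumed figures the average comes out at roughly 2.5 ns, even though one access in fifty still goes all the way to main memory.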

One might ask: “Why can’t we make all memory equally fast, so that there is no need for cache memory?”
The answer is: “Yes, all memory could be made equally fast, but it would be too expensive. To reduce the cost, we use a small amount of fast memory as a cache.”

Reference:

  1. Introduction to Computers by Peter Norton (7th Edition)