Talk:Paging

This article is within the scope of Computing WikiProject, an attempt to build a comprehensive and detailed guide to computers and computing. If you would like to participate, you can edit the article attached to this page, or visit the project page, where you can join the project and/or contribute to the discussion.
This article has not yet received a rating on the quality scale.
This article has not yet received a rating on the importance scale.

x86-architecture specific

Clearly there are portions of this article that assume the x86 architecture is understood. I say, remove these architecture-specific parts as the concepts in the article apply universally.

I've removed the "only on 286" and "only on 386" bits and replaced them with a 'for example' note for Intel. Could probably be removed entirely. The example looks like an Intel system, but since it's only an example, that can probably stay. - Lady Lysine Ikinsile 12:19, Jun 9, 2004 (UTC)

Paging and Page (computer science)

I've separated out references to Paging and Page (computer science). I've also basically gotten rid of references to memory paging, replacing them with Page (computer science).

For now, Page (computer science) is a redirect to Paging, but I've kept them separate so that someone in the future can make a proper article for Page (computer science).

Anyone feel like starting on the article Page (computer science)?

Seems really fine now.

Removed wrong statement

I edited out the following line:

(in the Intel x86 family, for example, only i386 and higher CPUs possess MMUs)

That's wrong. The 286 lacked support for paging (used segmentation instead), but it did have an MMU.

re-wrote entire article

The original article mixed in discussion of advantages and disadvantages that really applied to a discussion of operating system design, not the paging system. For example, one listed disadvantage stated that inter-task communication was more difficult. This would only be the case if two tasks running in different address spaces wanted to share data in RAM.

Address space isolation is usually restricted to different users sharing the same computer, not two tasks run by the same user. A database system might have a task reading a file and another writing data from the file to the database. These two tasks would run in the same address space and would be able to see the entire RAM addressable by each other.

Large commercial installations will have many users sharing the computer simultaneously. In this case, each user would have their own address space. A task run by one user could not see the memory used by a task run by another user - it is totally invisible to them. A task cannot generate a page fault trying to address memory used by a different address space; it simply can't do it.

--Eric 14:13, 11 February 2007 (UTC)
"Address space isolation is usually restricted to different users sharing the same computer, not two tasks run by the same user."
Huh?! Separate processes always run in their own address spaces, no matter which user runs them. Applications might decide to use shared memory for communication; however, they will only share some mappings in that case, and not the entire address space. -- intgr 15:21, 11 February 2007 (UTC)
As I said earlier, a database system that has two tasks or processes, one that reads from a file and one that writes to a database, will run each in the same address space. They will not need to use shared memory. Operating systems and many applications could not run if all their subtasks each had their own address space. It really depends on the operating system though, so which operating system are you referring to? Many systems will create an address space for each application, so there are many ways to mix and match. In most cases, memory protection between tasks is enough security.

I am not sure about a Java virtual machine running under Windows; are you familiar with that? --Eric 16:12, 11 February 2007 (UTC)

Yes, Java can spawn separate virtual machines for a single user — and even if it didn't, it would be some trickery within the JVM, and irrelevant to the operating system. -- intgr 18:59, 11 February 2007 (UTC)
I've only worked on two operating systems myself and neither would provide a discrete address space to a subtask: there is too much overhead in updating the page tables every time there is a context switch between tasks that are tightly coupled. I am not sure where we are going with this discussion or if you agree with me. I am reacting to "Separate processes always run in their own address spaces". Perhaps we are just discussing semantics: processes, tasks, users ... Thoughts? —The preceding unsigned comment was added by Sailorman2003 (talkcontribs) 19:26, 11 February 2007 (UTC).
Sorry, I missed your other earlier comment due to the confusing indentation.
"As I said earlier, a database system that has two tasks or proceses"
Perhaps I come from a different background than you, but as far as I can tell, the term "task" is never solidly defined. I realize now that you were referring to threads, and not processes. Yes, separate threads under a single process do typically share the memory space (such as the DBMS server/daemon process). However, not all DBMSes use threads — the PostgreSQL server, for example, forks new processes instead of threads. Or with Berkeley DB/SQLite, there is no server at all; database updates are done within the address space of the calling process, and concurrent processes accessing the same database use memory-mapped files or shared memory for concurrency control.
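A small, self-contained illustration (plain C on a Unix-like system; not taken from any DBMS, and the names are only illustrative) of the distinction being discussed: a thread shares its parent's address space, while a forked process gets its own copy-on-write copy of it.

    #include <pthread.h>
    #include <stdio.h>
    #include <sys/wait.h>
    #include <unistd.h>

    static int counter = 0;

    static void *thread_body(void *arg)
    {
        (void)arg;
        counter = 1;               /* same address space: main() will see this */
        return NULL;
    }

    int main(void)
    {
        pthread_t t;
        pthread_create(&t, NULL, thread_body, NULL);
        pthread_join(t, NULL);
        printf("after thread: counter = %d\n", counter);   /* prints 1 */

        counter = 0;
        pid_t pid = fork();
        if (pid == 0) {
            counter = 1;           /* written to the child's private copy of the page */
            _exit(0);
        }
        waitpid(pid, NULL, 0);
        printf("after fork:   counter = %d\n", counter);    /* prints 0 */
        return 0;
    }

(Compile with something like gcc -pthread.)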
"so which operating system are you referring to?"
Pretty much all the mainstream ones: Unices, Linux (since it got the clone() syscall), *BSD, Mac OS X, Windows NT, Windows 9x.
"there is too much overhead in updating the page tables every time there is a context switch between tasks that are tightly coupled."
"Tightly coupled"? You mean message-passing between threads? How many DBMSes use synchronous message-passing for communication? Normally, performance-critical or heavily concurrent code handles concurrency with locking and occasional semaphores, to keep the number of context switches at minimum, and to increase memory locality and concurrency on SMP computers (at the cost of complexity). While graphical user interfaces normally use message passing, usually one thread is processing all GUI messages within a single process — again, no context switching.
As far as I can tell, there is no difference at the kernel level between switching contexts between different processes and switching between threads within a single process. However, the old Linux threading library (before the clone() syscall appeared), and a few programming language interpreters, do implement concurrency at a higher level, dividing a single process and a single OS thread into several logical threads, managing the "context switching" in user space, so that indeed no context switches are actually made, and no updates to page tables are needed. Sometimes, this approach is called "microthreading" (though this term also has other meanings). This is, however, increasingly rare these days. The standard Java and Python interpreters, for example, use native OS threads where available. But indeed, microthreads do yield better performance under specific workloads, for the reason you pointed out.
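For what it's worth, here is a minimal sketch of the user-space "context switching" mentioned above, using the POSIX ucontext functions (getcontext/makecontext/swapcontext). The two transfers of control below never involve the kernel scheduler, so no page tables are touched; the names are only illustrative.

    #include <stdio.h>
    #include <ucontext.h>

    static ucontext_t main_ctx, worker_ctx;
    static char worker_stack[64 * 1024];

    static void worker(void)
    {
        puts("worker: first run");
        swapcontext(&worker_ctx, &main_ctx);   /* yield back to main, purely in user space */
        puts("worker: resumed");
    }                                          /* returning follows uc_link back to main */

    int main(void)
    {
        getcontext(&worker_ctx);
        worker_ctx.uc_stack.ss_sp   = worker_stack;
        worker_ctx.uc_stack.ss_size = sizeof worker_stack;
        worker_ctx.uc_link          = &main_ctx;
        makecontext(&worker_ctx, worker, 0);

        swapcontext(&main_ctx, &worker_ctx);   /* "switch" to the worker: no kernel context switch */
        puts("main: worker yielded");
        swapcontext(&main_ctx, &worker_ctx);   /* resume the worker where it left off */
        puts("main: worker finished");
        return 0;
    }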
-- intgr 22:32, 11 February 2007 (UTC)

My operating experience is with VM, IBM's multi-user operating system, and it was 15 years ago. That said, it was then and probably is now one of the most powerful multi-user operating systems available.

After reading your latest comments, I think we are mostly in agreement about when there is a context switch and when not.

VM ran the kernel in real addressing mode and the code resided in low memory. The rest of the operating system resided in a shared segment that was mapped into every user's virtual machine and ran within the user's address space.

DBMS systems ran partly in the user's virtual machine and partly in their own virtual machine, each with its own discrete address space. The user virtual machine ran the code that created schemas, parsed SQL statements, generated stored procedures...

The DBMS virtual machine did the query optimization, ran the queries that materialized the set and ran any necessary stored procedures. It ran entirely in one address space. It handled concurrency as you suggested with semaphores. Communication between the user portion of the DBMS and the DBMS virtual machine was through message passing, as I believe is the case with Windows.

I don't have an in-depth knowledge of UNIX, but I believe it does essentially the same thing.--Eric 18:38, 15 February 2007 (UTC)

Seems like this confusion is caused purely due to a difference in terminology. :)
Conceptually, the Java VM has very little in common with the term "virtual machine" in VM-based operating systems.
Unix doesn't have a concept of virtual machines. Essentially, the VM approach seems to imply a strict process hierarchy of OS→user→process or OS→VM→process, while Unices have a "flat" process model, followed by threads on the lower level: OS→process→thread. "User" is just an attribute of the process. A process can interact with other processes only through syscalls (calls to the kernel), no matter who owns the processes. Though syscalls can also be used to set up shared memory regions.
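A minimal sketch of the "shared memory regions set up through syscalls" case (using MAP_ANONYMOUS as on Linux/BSD; details vary between Unices): a MAP_SHARED mapping created before fork() remains the same physical memory in both processes, unlike ordinary copy-on-write memory.

    #include <stdio.h>
    #include <sys/mman.h>
    #include <sys/wait.h>
    #include <unistd.h>

    int main(void)
    {
        int *shared = mmap(NULL, sizeof *shared, PROT_READ | PROT_WRITE,
                           MAP_SHARED | MAP_ANONYMOUS, -1, 0);
        *shared = 0;

        if (fork() == 0) {       /* child: the very same page is mapped here */
            *shared = 42;
            _exit(0);
        }
        wait(NULL);
        printf("parent sees %d\n", *shared);   /* prints 42: the mapping is shared */
        munmap(shared, sizeof *shared);
        return 0;
    }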
So if I understood you correctly: in this OS, all processes in a single VM shared the address space; interaction between separate VMs had to go through the kernel. In this case, the VMs are analogous to processes in Unix, and VM processes are analogous to OS threads in Unix. Switches between separate threads of a single Unix process don't go through a context switch either (though schedulers do not seem to optimize for this due to fairness considerations).
"Communication between the user portion of the DBMS and the DBMS virtual machine was through message passing as I believe is the case with windows."
Communication between the server and client is indeed done over sockets, most likely implementing a custom messaging protocol. When I said they use locks/semaphores, I meant concurrency within the DBMS server. -- intgr 10:46, 22 February 2007 (UTC)

Thanks for the clarification; now I think we are on the same page:). --Eric 19:28, 22 February 2007 (UTC)

"so which operating system are you referring to?"
Pretty much all the mainstream ones: Unices, Linux (since it got the clone() syscall), *BSD, Mac OS X, Windows NT, Windows 9x.
And also VMS. Jeh (talk) 16:15, 20 February 2008 (UTC)
"there is too much overhead in updating the page tables every time there is a context switch between tasks that are tightly coupled."
On most modern processors (and the VAX, and even x86 ;) ) the "overhead" amounts to loading a single register -- e.g. CR3 on x86 -- with the base address of the first-level page table (each process has a different one). To be sure there is some slight additional cost associated with this, as the non-global entries in the translation cache have to be invalidated... but there are not that many of them in the first place and so the cache gets repopulated quickly. A process-to-process (that is, address space to address space) context switch is therefore only marginally more costly than an intra-process, thread-to-thread context switch. Jeh (talk) 16:15, 20 February 2008 (UTC)
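As a rough sketch of that point (hypothetical kernel-style code with GCC inline assembly for x86; not taken from any real kernel), the address-space part of the switch really is one register write, after which the non-global TLB entries are invalidated as a side effect:

    /* switch to another process's address space by loading CR3 with the
       physical address of its top-level page table */
    static inline void switch_address_space(unsigned long new_pgd_phys)
    {
        __asm__ volatile("mov %0, %%cr3" : : "r"(new_pgd_phys) : "memory");
    }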

Definition of terms

Given two ratios: 2:10 and 8:10, which is high and which is low? —The preceding unsigned comment was added by Sailorman2003 (talkcontribs) 14:51, 3 March 2007 (UTC).

Hockey stick

"the graph will look like a hockey stick"? —The preceding unsigned comment was added by 71.202.164.218 (talk)

Who was first

Jacek Karpinski claimed the K-202 (his computer) was the first to use paging. Was there any earlier one? Szopen 06:02, 21 May 2007 (UTC)

Err, umm, the IBM System/360 Model 67 and the GE 645, to name two commercial machines? Or, earlier, the IBM M44/44X and the Manchester/Ferranti Atlas? 1971 is a bit late to be claiming to have pioneered paged virtual memory; the Atlas was decommissioned in 1971. Guy Harris 18:21, 18 September 2007 (UTC)
Or perhaps they meant "first minicomputer to use paging"? I've updated the article to say that. Guy Harris 18:27, 18 September 2007 (UTC)
I would say it's a hoa^W unverifiable information. The K-202 is some kind of an urban legend here in Poland. Various press articles contain contradictory technical specs, and I am not aware of any proper sources (yet). --Kubanczyk 20:53, 19 September 2007 (UTC)
I still think it might be interesting to check why the K-202 had 8 MB of memory whilst, for example, the SuperNova had only 64 KB or something like that. —Preceding unsigned comment added by 81.168.138.71 (talk) 17:50, 28 November 2007 (UTC)
The claim about 8 MB of memory is blatantly false; please provide reliable sources, as required by Wikipedia policy WP:RS. The K-202's marketing information vaguely mentioned that it may scale up to 8 MB, but this was a theoretical possibility never verified in practice. --Kubanczyk (talk) 10:18, 20 February 2008 (UTC)
Kubanczyk, Karpinski himself said it used "paging". http://www.historycy.org/index.php?showtopic=33075&pid=274657&st=0 (from your nickname, I presume you know Polish)
"Zastosowałem tam kompletną nowość - powiększenie pojemności pamięci przez adresowanie stronicowe. To mój wynalazek. W Londynie, na wystawie w Olimpii, stały obok siebie: brytyjski Modular One, amerykańskie maszyny i K-202 - wszystkie 16-bitowe. I wszystkie miały 64 kilo pamięci, a K-202 - 8 mega! Wszyscy pytali, jak to zrobiłem. Odpowiadałem, że zrobiłem i, jak widać, działa. W jakiś czas później przyjechał do mnie do Warszawy konstruktor z CDC22. To była wtedy jedna z największych amerykańskich firm komputerowych. Chciał się dowiedzieć, jak dokonałem cudu. Powiedziałem mu, żeby się domyślił, bo to bardzo proste. Myślał dwa dni - i nic. To mu w końcu powiedziałem. Potem przyjechał następny inżynier z DEC23. Też mu powiedziałem."
Szopen (talk) 13:21, 26 May 2008 (UTC)
What troubles me is that Karpiński lied in this interview. Not only is this unreliable and original research, but above all it contains at least one big lie. He said "I used a complete novelty (...) paging." when referring to his computer from the 1970s. In fact, virtual memory using paging was first introduced 10 years before in the Atlas Computer, and in the 1970s it was included in the S/370 line, probably the most popular mainframe at the time. So it was a widely, publicly known invention by 1970! Karpiński might be the first one to use this concept in a minicomputer (as opposed to a mainframe computer), but this is yet to be proven by a third-party source. After such a statement, Karpiński's contemporary info should never be cited as a source here. --Kubanczyk (talk) 14:01, 26 May 2008 (UTC)

Last paragraph

If one graphs the number of faults on one axis and working set size on the other, the graph will look like a hockey stick. Why don't we get an actual graph or pseudograph instead? I had difficulty imagining that, because I imagined the "hockey stick" as pointing rightwards instead of pointing leftwards. For now, I'm just going to edit that so it reads "will look like a leftward pointing hockey stick." Miggyb 05:33, 24 August 2007 (UTC)

Use the {{reqdiagram}} template on talk pages; maybe someone can create it for us. -- intgr #%@! 23:06, 25 August 2007 (UTC)

Virtual memory vs. physical memory

Since the bus speed of physical memory is far greater than that of virtual memory stored on hard disks, why is virtual memory used long before physical memory is used? For example, I have 1.5 GB of RAM, yet only 512 MB of that is used, while at the same time 300-500 MB of virtual memory is used. It's NOT more efficient because of the difference in speed, so why isn't most of my RAM being used?

PS: I'm using Windows XP

anon 16:48 GMT 09/09/2007

Good question, bad place - ask a Windows expert... Or just don't worry, it's only Windows after all. --Kubanczyk 20:55, 19 September 2007 (UTC)
Really, it is not a case of "virtual is used before physical" at all. This confusion derives from Windows' misleading displays and nomenclature. The best reference for this would be Windows Internals by Russinovich and Solomon. Or you could ask in the forums at arstechnica.com. Jeh (talk) 01:32, 8 January 2008 (UTC)

Paging vs. swapping

I'm afraid that this article is once again blurring the term "swapping" (like virtual memory once did). Paging can mean two different things:

  • In computer architecture, a technique for implementing virtual memory, where the virtual address space is divided into fixed-sized blocks called pages, each of which can be mapped onto any physical address available on the system (see the sketch after this list).
  • In operating systems it's the act of managing disk-backed data in main memory. The terms "paging in" and "paging out" respectively refer to loading and dropping of disk-backed pages from main memory (usually the page cache, but also applied to swap space — I am not sure whether this is correct usage or not).
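To illustrate the first (architectural) meaning, a minimal sketch assuming 4 KiB pages and a hypothetical single-level page table of frame numbers (real MMUs use multi-level tables and permission bits, but the split into page number and offset is the same):

    #include <stdint.h>

    #define PAGE_SHIFT 12
    #define PAGE_SIZE  (1UL << PAGE_SHIFT)         /* 4096-byte pages */

    /* page_table[] maps virtual page numbers to physical frame numbers */
    uint64_t translate(uint64_t vaddr, const uint64_t *page_table)
    {
        uint64_t vpn    = vaddr >> PAGE_SHIFT;         /* virtual page number    */
        uint64_t offset = vaddr & (PAGE_SIZE - 1);     /* offset within the page */
        uint64_t pfn    = page_table[vpn];             /* physical frame number  */
        return (pfn << PAGE_SHIFT) | offset;           /* physical address       */
    }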

However, the term "swapping" is only applied to moving dynamic application memory from and to a pre-allocated pool in secondary storage, not memory that is already backed on the disk. From a practical perspective, one difference is that paging out disk-backed pages only involves writing if the page is marked dirty (in the general case, it's just read-only executable code). Swapping something out always involves writing to the disk.

Does this make sense? -- intgr [talk] 11:04, 5 October 2007 (UTC)

I've tried to point out meaning 1 in the article, hmmm have I failed at this? It is not WP:COMMONNAME now, so it's only a side note. About meaning 2, I don't know, but please be very, very careful to stay with the reliable sources. Provide your own sources. Let the article be NPOV (not only Unix-like terminology). This is a controversial topic, so in a second we could have some people here in a Linux/Windows war (again). --Kubanczyk 12:40, 5 October 2007 (UTC)
"I've tried to point out meaning 1 in the article, hmmm have I failed at this?"
No, this is not what I intended to say.
I can live with both concepts explained on this page, for now. My immediate concern is that this article should not state that "paging" is the same thing as "swapping"; these terms are used interchangeably in some parts of the article, especially these two quotes: "paging, sometimes called swapping" in the lead section, and "Most Unix-like systems, including Linux, use term swapping instead of paging".
All modern Unix-like OSes implement paging for disk files that are cached or buffered in RAM, just like Windows (per my second definition). The difference is that Windows does not have a distinct term for "swap" (in their user interfaces anyway), so they arbitrarily say "virtual memory" or "paging" instead. (Windows 3.x did actually call the swap file "WIN386.SWP", go figure). To put it simply, what I'm saying is that Windows's meaning of "paging" is a superset of the Unix meanings of "paging" and "swapping".
Can you agree with me this far? If not, what in particular do I provide sources for? -- intgr [talk] 18:09, 5 October 2007 (UTC)
Well, alternate sources are definitely required if you want to change some well-sourced sentences. But I hope that's obvious. I mean the lead section here.
For the Windows part I agree.
For Unix-like: "Most Unix-like systems, including Linux, use term swapping instead of paging" - this I've got wrong, I admit. However, not being an expert, my experience is that paging=pagingS+caching. PagingS==swapping. Do you state otherwise: paging==caching? In such case I think you should provide a source too. --Kubanczyk 22:00, 5 October 2007 (UTC)

Sorry, I haven't found the time to do any useful work on this article, but I just noticed another awkward claim creep in: "there is a concept known as page cache, of using the same single mechanism for both virtual memory and disk caching."

I thought we already established that swapping is not the same as virtual memory? And page cache is in fact not related to swapping; you may or may not say that swapping and paging are related, but the page cache keeps track of the cache of pages that are already present on the disk with copies in the main memory. Obviously "swappable" memory is not already present on the disk, hence why it needs a preallocated storage space. And once it's written to the disk, it is deallocated from the main memory. -- intgr [talk] 01:33, 19 October 2007 (UTC)

Duh, in fact I'm somewhat proud of this sentence and I'll defend it. First of all, forget about swapping - it mentions "virtual memory mechanism", a term used purposely and in the purest sense. In a machine with virtual memory and all the mechanisms that support it, you could still use any other disk cache mechanism to utilize your free memory. A page cache is very special in the sense that it re-uses exactly the same mechanism as virtual memory does: the same (or very similar) kernel code that did swapping can now perform paging, the same mechanism that implemented "VM address translation" can now implement mmap(), and you have a very similar page fault procedure in normal and mmaped memory (either read a swap file or a normal file, respectively). That is what I was trying to express.
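To make the mmap() point above concrete, a small sketch (the file path is only an example): touching the mapped bytes below goes through the same page-fault path as ordinary demand paging, only the backing store is a regular file reached through the page cache rather than swap.

    #include <fcntl.h>
    #include <stdio.h>
    #include <sys/mman.h>
    #include <sys/stat.h>
    #include <unistd.h>

    int main(void)
    {
        int fd = open("/etc/hostname", O_RDONLY);     /* example file only */
        struct stat st;
        if (fd < 0 || fstat(fd, &st) < 0 || st.st_size == 0)
            return 1;

        char *data = mmap(NULL, st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
        if (data == MAP_FAILED)
            return 1;

        /* the first access page-faults; the kernel satisfies it from the page
           cache, reading from disk only if the page is not cached yet */
        fwrite(data, 1, st.st_size, stdout);

        munmap(data, st.st_size);
        close(fd);
        return 0;
    }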
Going further, if I understand correctly, you suggest that the main difference is that the page cache is a read-only cache, while swappable memory is a read-write one. Next you state that normal resident memory cannot have a copy already swapped out. I think those definitions would be too narrow: there is nothing inherently wrong in having a page in a disk cache that has (yet) no copy on the disk; there is nothing wrong with pre-fetching a page from swap space for no apparent reason, just for the pleasure of keeping a non-dirty copy in the memory (like ck does). --Kubanczyk 18:43, 19 October 2007 (UTC)


I find this article confusing. All references to swapping should be removed from the main article. Swapping is a method of workload control, i.e. the operating system determines that the resources required by this task (CPU cycles, RAM, whatever) would be more productive if they were freed and applied to another task. Therefore, the workload will be stopped and its RAM contents copied to another medium until resources are available for its continuation.

Swapping does not require paging but may be easier to implement if pages exist.
MVS/ESA operating systems (and follow-on versions) implemented swapping either to RAM storage that was multi-processor locked at the page level instead of the byte level, or to disk.
The DEC RT-11 operating system permitted swapping to any block structured device including DECtape. RT-11 did not support pages and could swap a swappable area of arbitrary length from any location.

Paging is a mechanism for resource allocation, i.e. this collection of work needs more real RAM than is available. The operating system will provide a mechanism to allocate real RAM to active workloads and withdraw real RAM from less active or less important workloads. Ccalvin 13:19, 21 October 2007 (UTC)

"Swapping is a method of workload control" - I would like to point out, that it is a valid definition of the term, but a very old one. The second definition is somewhat more common now as it is used in Unix-likes, including Linux. Now, per general Wikipedia rules, both meanings should be stated here. I tried to do it, honestly. Feel free to expand on the more traditional meaning (it is there already). But, do not remove "all references to swapping", especially if it is a well sourced material.
Your paging definition is unclear, probably wrong. Please provide sources. --Kubanczyk 20:10, 21 October 2007 (UTC)


I concur with the opinion that this difference needs to be clearly delineated in the text; it is an error to say ~"well, they're pretty much the same", even though some contexts do certainly use the terms interchangeably...
One point of differentiation is that paging systems are going to work hand-in-hand with address translation. This is because paging is triggered by "page faults", which in turn are triggered by lack of a "valid" bit in the virtual page's page table entry... which will not be in use in the first place if address translation is not in effect. Swapping otoh can most certainly occur without page tables being enabled...
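A rough sketch of that dependency (hypothetical structures and a stubbed-out disk read, not any real kernel's code): the page-fault handler only gets invoked because the MMU consulted a page table entry and found its valid bit clear.

    #include <stdio.h>

    /* hypothetical page table entry layout, for illustration only */
    struct pte {
        unsigned      valid : 1;   /* set when the page is resident in RAM       */
        unsigned      dirty : 1;   /* set when the page was written since load   */
        unsigned long frame;       /* physical frame number, meaningful if valid */
        unsigned long backing;     /* slot in the backing store (file or swap)   */
    };

    /* stub standing in for real disk I/O */
    static unsigned long read_page_from_backing_store(unsigned long slot)
    {
        printf("paging in from backing-store slot %lu\n", slot);
        return 7;                  /* pretend physical frame number */
    }

    /* conceptually invoked when a translation finds the valid bit clear */
    void page_fault_handler(struct pte *pte)
    {
        if (!pte->valid) {
            pte->frame = read_page_from_backing_store(pte->backing);
            pte->dirty = 0;        /* fresh copy matches the backing store */
            pte->valid = 1;        /* the faulting instruction can be restarted */
        }
    }

    int main(void)
    {
        struct pte entry = { .valid = 0, .dirty = 0, .frame = 0, .backing = 3 };
        page_fault_handler(&entry);
        printf("entry now: valid=%u frame=%lu\n", entry.valid, entry.frame);
        return 0;
    }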
Incidentally, VMS (now "OpenVMS", if you must insist) uses both paging and swapping: Swapping applies to entire working sets at a time, while paging is done on a page by page basis -- only, of course, to pages within an inswapped working set...
The NT family does not do whole-process "swapping" exactly, but implements a similar net result: Its "Balance set manager", at times of extreme memory pressure, will gradually reduce the limit on an idle process's working set size to nearly zero, thereby forcing it to lose nearly all of its pages (other than shared pages still in other working sets) to the standby and modified page lists, much as in normal w.s. replacement. Thus only modified pages need be written to backing store, unlike a traditional outswap. There is no whole-working-set inswap either; the limit is simply set back to a larger value and the process is allowed to demand-page what it needs back into its working set. All of this is a lot less work for the disk subsystem than a traditional whole-process outswap/inswap would be. ...
I believe the example of OpenVMS, and related functionality in Windows, highlights the need to differentiate the terms. One could at least say "where an operating system uses both mechanisms, paging means x and swapping means y." No? Jeh (talk) 01:46, 8 January 2008 (UTC)

merge

I suggest merging the "demand paging" article into a section of "paging" article.

Most of the "demand paging" article repeats information already covered in the "paging" article, so this merge will make the "paging" article a few sentences longer. Merging would also make it easier to compare and contrast demand paging with the alternatives. --68.0.124.33 (talk) 15:35, 3 April 2008 (UTC)

Oppose - I would say that the best thing about "Demand paging" is that it is already a nice Wikipedia:Summary style article, as it does not repeat information from this one. Could you quote what information is repeated, as we seem to disagree? Also, demand paging is a pretty advanced idea compared to traditional paging, and I would not want it to vanish somewhere in a big article. --Kubanczyk (talk) 16:46, 3 April 2008 (UTC)
I am confused.
Wikipedia:Summary style says that "Summary sections are linked to the detailed article with a {{main|<name of detailed article>}} or comparable template".
I don't see where the "demand paging" article links to any more detailed article in that way.
Are you alluding to some other article that links to the more detailed "demand paging" article in some summary style? I don't see that, either.
Are you suggesting that some other article -- paging or virtual memory or some other article -- ought to link to the more detailed "demand paging" article in a summary style? That may be a good idea.
--68.0.124.33 (talk) 23:13, 3 April 2008 (UTC)
Yep, "Paging" ought to link to more detailed "Demand paging" in a summary style. That may be a good idea. --Kubanczyk (talk) 12:33, 4 April 2008 (UTC)
OK, done. (Plus I tried to fill in the other alternatives to "demand paging" -- did I miss any?) --68.0.124.33 (talk) 03:43, 9 April 2008 (UTC)