linux

Author	SHA1	Message	Date
Michael Krufky	a3497135d8	V4L/DVB (5649): Umt-010: convert tua6034 handling to properly use dvb-pll Signed-off-by: Michael Krufky <mkrufky@linuxtv.org> Signed-off-by: Mauro Carvalho Chehab <mchehab@infradead.org>	2007-07-18 14:23:23 -03:00
Michael Krufky	8906939252	V4L/DVB (5648): Dvb/frontends: remove unnecessary #include's of "dvb-pll.h" These sources do not need to #include "dvb-pll.h" Signed-off-by: Michael Krufky <mkrufky@linuxtv.org> Signed-off-by: Mauro Carvalho Chehab <mchehab@infradead.org>	2007-07-18 14:23:22 -03:00
Tony Wan	480f75acbc	V4L/DVB (5647): Saa7134: enable ir-remote for 10moons TM300 Using Encore's key codes, we needn't add any additional key table. Signed-off-by: Tony Wan <wankai@sjtu.org> Signed-off-by: Mauro Carvalho Chehab <mchehab@infradead.org>	2007-07-18 14:23:21 -03:00
Tony Wan	aaccb82bdb	V4L/DVB (5646): V4l: saa7134: add support for 10moons TM300 card Support the 10moons TM300 TV card (so called TV Master 3), which is a 10moons saa7130 based board. Here not include features for the IR-remote. It has been tested using TVTIME. The card was auto-detected and all the input sources worked correct with sound. Signed-off-by: Tony Wan <wankai@sjtu.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Mauro Carvalho Chehab <mchehab@infradead.org>	2007-07-18 14:23:20 -03:00
Thierry MERLE	c5f48367fe	V4L/DVB (5644): Usbvision: video_ioctl2 conversion The ioctl entry point, a big switch/case, is splitted in little functions. These functions are set as callbacks for the video_ioctl2 video4linux facility. This improves the driver memory consumption and enables the v4l1 compatibility as a side effect. Signed-off-by: Thierry MERLE <thierry.merle@free.fr> Acked-by: Dwaine P. Garden <dwainegarden@rogers.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@infradead.org>	2007-07-18 14:23:20 -03:00
Thierry MERLE	ea1f83cee9	V4L/DVB (5643): Usbvision: make common video and radio ioctls Radio and video ioctls are the same, delete the usbvision_do_radio_ioctl function add the special cases for radio in usbvision_v4l2_do_ioctl Signed-off-by: Thierry MERLE <thierry.merle@free.fr> Signed-off-by: Mauro Carvalho Chehab <mchehab@infradead.org>	2007-07-18 14:23:19 -03:00
David Warman	8c7189d193	V4L/DVB (5642): add comment that VO_MODE is also being set. usbvision_set_video_format: add comment that VO_MODE is also being set. Signed-off-by: David Warman <dwarman@davidwarman.net> Signed-off-by: Thierry MERLE <thierry.merle@free.fr> Signed-off-by: Mauro Carvalho Chehab <mchehab@infradead.org>	2007-07-18 14:23:17 -03:00
David Warman	9fe01e5c29	V4L/DVB (5641): change VideoNorm to NTSC for Belkin USB Videobus II Signed-off-by: David Warman <dwarman@davidwarman.net> Signed-off-by: Thierry MERLE <thierry.merle@free.fr> Signed-off-by: Mauro Carvalho Chehab <mchehab@infradead.org>	2007-07-18 14:23:17 -03:00
Michael Krufky	9b98fd28b4	V4L/DVB (5635): Budget-av: convert philips sd1878 / tda8261 to use dvb-pll removed philips_sd1878_tda8261_tuner_set_params, using dvb_pll_attach, instead. Signed-off-by: Michael Krufky <mkrufky@linuxtv.org> Signed-off-by: Mauro Carvalho Chehab <mchehab@infradead.org>	2007-07-18 14:23:16 -03:00
Michael Krufky	8511df9ec2	V4L/DVB (5634): Saa7134-dvb: convert philips td1316 handling to use dvb-pll removed mt352_aver777_tuner_calc_regs, using dvb_pll_attach, instead. Signed-off-by: Michael Krufky <mkrufky@linuxtv.org> Signed-off-by: Mauro Carvalho Chehab <mchehab@infradead.org>	2007-07-18 14:23:15 -03:00
Michael Krufky	4abe9f9d94	V4L/DVB (5633): Tuv1236d: move rf input switching code into dvb-pll This patch removes duplicate code from cx88-dvb and saa7134-dvb that handles rf input switching for the TUV1236d tuner. The functionality is added to dvb-pll, where all the other code that handles the TUV1236d is kept. Signed-off-by: Michael Krufky <mkrufky@linuxtv.org> Signed-off-by: Trent Piepho <xyzzy@speakeasy.org> Signed-off-by: Mauro Carvalho Chehab <mchehab@infradead.org>	2007-07-18 14:23:14 -03:00
Michael Krufky	77d6750470	V4L/DVB (5632): Dvb-pll: pass dvb_frontend_parameters to generic set() function Rename dvb_pll_desc.setbw() to set(), and accept struct dvb_frontend_parameters instead of passing both freq and bandwidth, so that this may be used as a generic function. In order to do this, dvb_pll_configure must also be altered in the same manner, to take struct dvb_frontend_parameters instead of freq and bandwidth. Signed-off-by: Michael Krufky <mkrufky@linuxtv.org> Signed-off-by: Trent Piepho <xyzzy@speakeasy.org> Signed-off-by: Mauro Carvalho Chehab <mchehab@infradead.org>	2007-07-18 14:23:12 -03:00
Simon Arlott	900858ecb3	V4L/DVB (5631): Dvb-core: Add level fixes to printk()s, plus spelling/grammer All the printks had missing level prefixes so I've fixed these too. Also fixed some grammer errors. Signed-off-by: Simon Arlott <simon@fire.lp0.eu> Signed-off-by: Trent Piepho <xyzzy@speakeasy.org> Signed-off-by: Mauro Carvalho Chehab <mchehab@infradead.org>	2007-07-18 14:23:12 -03:00
Trent Piepho	ecf854df72	V4L/DVB (5629): Cx88: VP3054 support can't be a module when cx88 is compiled in If cx88 support is compiled into the kernel while vp3054 is left as a module, the kernel will fail to link. Adjust the existing "#if" code in cx88 so that it won't consider vp3054 to be supported in this case. It might make sense to move vp3054 selection into the "customisation" menu instead of a cx88 sub-option (though this is a cx88 feature, there is no extra chip involved). It might also make sense to use dvb_attach() to load vp3054 support. Signed-off-by: Trent Piepho <xyzzy@speakeasy.org> Signed-off-by: Mauro Carvalho Chehab <mchehab@infradead.org>	2007-07-18 14:23:11 -03:00
Mauro Carvalho Chehab	8573a9e6a8	V4L/DVB (5563a): Add experimental support for tea5761 tuner This driver were made based on tea5761 specs. Signed-off-by: Mauro Carvalho Chehab <mchehab@infradead.org>	2007-07-18 14:23:11 -03:00
Linus Torvalds	5cc97bf2d8	Merge branch 'xen-upstream' of ssh://master.kernel.org/pub/scm/linux/kernel/git/jeremy/xen * 'xen-upstream' of ssh://master.kernel.org/pub/scm/linux/kernel/git/jeremy/xen: (44 commits) xen: disable all non-virtual drivers xen: use iret directly when possible xen: suppress abs symbol warnings for unused reloc pointers xen: Attempt to patch inline versions of common operations xen: Place vcpu_info structure into per-cpu memory xen: handle external requests for shutdown, reboot and sysrq xen: machine operations xen: add virtual network device driver xen: add virtual block device driver. xen: add the Xenbus sysfs and virtual device hotplug driver xen: Add grant table support xen: use the hvc console infrastructure for Xen console xen: hack to prevent bad segment register reload xen: lazy-mmu operations xen: Add support for preemption xen: SMP guest support xen: Implement sched_clock xen: Account for stolen time xen: ignore RW mapping of RO pages in pagetable_init xen: Complete pagetable pinning ...	2007-07-18 10:18:39 -07:00
Tony Breeds	826ea8f22c	Revert "[POWERPC] Do firmware feature fixups after features are initialised" This reverts commit `5a26f6bbb7`. The original patch causes boot failures when built with ppc64_defconfig. The quickest fix is to revert it while alterates are investigated. Signed-off-by: Tony Breeds <tony@bakeyournoodle.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-07-18 10:17:39 -07:00
Tony Breeds	4f3731da16	Fix compile failure in arch/powerpc/kernel/pci-common.c This fixes the fallout from the recent powerpc merge (commit `489de30259`): CC arch/powerpc/kernel/pci-common.o arch/powerpc/kernel/pci-common.c:160: error: conflicting types for 'pcibios_add_platform_entries' include/linux/pci.h:889: error: previous declaration of 'pcibios_add_platform_entries' was here Signed-off-by: Tony Breeds <tony@bakeyournoodle.com> Tested-by: Bret Towe <magnade@gmail.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-07-18 10:17:39 -07:00
Jeremy Fitzhardinge	dfdcdd42fd	xen: disable all non-virtual drivers A domU Xen environment has no non-virtual drivers, so make sure they're all disabled at once. Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com> Cc: Rusty Russell <rusty@rustcorp.com.au>	2007-07-18 08:47:46 -07:00
Jeremy Fitzhardinge	9ec2b804e0	xen: use iret directly when possible Most of the time we can simply use the iret instruction to exit the kernel, rather than having to use the iret hypercall - the only exception is if we're returning into vm86 mode, or from delivering an NMI (which we don't support yet). When running native, iret has the behaviour of testing for a pending interrupt atomically with re-enabling interrupts. Unfortunately there's no way to do this with Xen, so there's a window in which we could get a recursive exception after enabling events but before actually returning to userspace. This causes a problem: if the nested interrupt causes one of the task's TIF_WORK_MASK flags to be set, they will not be checked again before returning to userspace. This means that pending work may be left pending indefinitely, until the process enters and leaves the kernel again. The net effect is that a pending signal or reschedule event could be delayed for an unbounded amount of time. To deal with this, the xen event upcall handler checks to see if the EIP is within the critical section of the iret code, after events are (potentially) enabled up to the iret itself. If its within this range, it calls the iret critical section fixup, which adjusts the stack to deal with any unrestored registers, and then shifts the stack frame up to replace the previous invocation. Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>	2007-07-18 08:47:46 -07:00
Jeremy Fitzhardinge	600b2fc242	xen: suppress abs symbol warnings for unused reloc pointers arch/i386/xen/xen-asm.S defines some small pieces of code which are used to implement a few paravirt_ops. They're designed so they can be used either in-place, or be inline patched into their callsites if there's enough space. Some of those operations need to make calls out (specifically, if you re-enable events [interrupts], and there's a pending event at that time). These calls need the call instruction to be relocated if the code is patched inline. In this case xen_foo_reloc is a section-relative symbol which points to xen_foo's required relocation. Other operations have no need of a relocation, and so their corresponding xen_bar_reloc is absolute 0. These are the cases which are triggering the warning. This patch adds those symbols to the list of safe abs symbols. Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com> Cc: Adrian Bunk <bunk@stusta.de>	2007-07-18 08:47:45 -07:00
Jeremy Fitzhardinge	6487673b8a	xen: Attempt to patch inline versions of common operations This patchs adds the mechanism to allow us to patch inline versions of common operations. The implementations of the direct-access versions save_fl, restore_fl, irq_enable and irq_disable are now in assembler, and the same code is used for both out of line and inline uses. Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com> Cc: Chris Wright <chrisw@sous-sol.org> Cc: Keir Fraser <keir@xensource.com>	2007-07-18 08:47:45 -07:00
Jeremy Fitzhardinge	60223a326f	xen: Place vcpu_info structure into per-cpu memory An experimental patch for Xen allows guests to place their vcpu_info structs anywhere. We try to use this to place the vcpu_info into the PDA, which allows direct access. If this works, then switch to using direct access operations for irq_enable, disable, save_fl and restore_fl. Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com> Cc: Chris Wright <chrisw@sous-sol.org> Cc: Keir Fraser <keir@xensource.com>	2007-07-18 08:47:45 -07:00
Jeremy Fitzhardinge	3e2b8fbeec	xen: handle external requests for shutdown, reboot and sysrq The guest domain can be asked to shutdown or reboot itself, or have a sysrq key injected, via xenbus. This patch adds a watcher for those events, and does the appropriate action. Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com> Cc: Chris Wright <chrisw@sous-sol.org>	2007-07-18 08:47:45 -07:00
Jeremy Fitzhardinge	fefa629abe	xen: machine operations Make the appropriate hypercalls to halt and reboot the virtual machine. Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com> Acked-by: Chris Wright <chrisw@sous-sol.org>	2007-07-18 08:47:45 -07:00
Jeremy Fitzhardinge	0d16021196	xen: add virtual network device driver The network device frontend driver allows the kernel to access network devices exported exported by a virtual machine containing a physical network device driver. Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com> Signed-off-by: Chris Wright <chrisw@sous-sol.org> Acked-by: Jeff Garzik <jeff@garzik.org> Cc: Ian Pratt <ian.pratt@xensource.com> Cc: Christian Limpach <Christian.Limpach@cl.cam.ac.uk> Cc: Stephen Hemminger <shemminger@linux-foundation.org> Cc: Christoph Hellwig <hch@infradead.org> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: Keir Fraser <Keir.Fraser@cl.cam.ac.uk> Cc: netdev@vger.kernel.org	2007-07-18 08:47:45 -07:00
Jeremy Fitzhardinge	9f27ee5950	xen: add virtual block device driver. The block device frontend driver allows the kernel to access block devices exported exported by a virtual machine containing a physical block device driver. Signed-off-by: Ian Pratt <ian.pratt@xensource.com> Signed-off-by: Christian Limpach <Christian.Limpach@cl.cam.ac.uk> Signed-off-by: Chris Wright <chrisw@sous-sol.org> Cc: Arjan van de Ven <arjan@infradead.org> Cc: Greg KH <greg@kroah.com> Cc: Jens Axboe <axboe@kernel.dk>	2007-07-18 08:47:45 -07:00
Jeremy Fitzhardinge	4bac07c993	xen: add the Xenbus sysfs and virtual device hotplug driver This communicates with the machine control software via a registry residing in a controlling virtual machine. This allows dynamic creation, destruction and modification of virtual device configurations (network devices, block devices and CPUS, to name some examples). [ Greg, would you mind giving this a review? Thanks -J ] Signed-off-by: Ian Pratt <ian.pratt@xensource.com> Signed-off-by: Christian Limpach <Christian.Limpach@cl.cam.ac.uk> Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com> Signed-off-by: Chris Wright <chrisw@sous-sol.org> Cc: Greg KH <greg@kroah.com>	2007-07-18 08:47:45 -07:00
Jeremy Fitzhardinge	ad9a86121f	xen: Add grant table support Add Xen 'grant table' driver which allows granting of access to selected local memory pages by other virtual machines and, symmetrically, the mapping of remote memory pages which other virtual machines have granted access to. This driver is a prerequisite for many of the Xen virtual device drivers, which grant the 'device driver domain' restricted and temporary access to only those memory pages that are currently involved in I/O operations. Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com> Signed-off-by: Ian Pratt <ian.pratt@xensource.com> Signed-off-by: Christian Limpach <Christian.Limpach@cl.cam.ac.uk> Signed-off-by: Chris Wright <chrisw@sous-sol.org>	2007-07-18 08:47:44 -07:00
Jeremy Fitzhardinge	b536b4b962	xen: use the hvc console infrastructure for Xen console Implement a Xen back-end for hvc console. * * * Add early printk support via hvc console, enable using "earlyprintk=xen" on the kernel command line. From: Gerd Hoffmann <kraxel@suse.de> Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com> Signed-off-by: Chris Wright <chrisw@sous-sol.org> Acked-by: Ingo Molnar <mingo@elte.hu> Acked-by: Olof Johansson <olof@lixom.net>	2007-07-18 08:47:44 -07:00
Jeremy Fitzhardinge	8b84ad942b	xen: hack to prevent bad segment register reload The hypervisor saves and restores the segment registers as part of the state is saves while context switching. If, during a context switch, the next process doesn't use the TLS segments, it invalidates the GDT entry, causing the segment register reload to fault. This fault effectively doubles the cost of a context switch. This patch is a band-aid workaround which clears the usermode %gs after it has been saved for the previous process, but before it gets reloaded for the next, and it avoids having the hypervisor attempt to erroneously reload it. Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com> Signed-off-by: Chris Wright <chrisw@sous-sol.org>	2007-07-18 08:47:44 -07:00
Jeremy Fitzhardinge	d66bf8fcf3	xen: lazy-mmu operations This patch uses the lazy-mmu hooks to batch mmu operations where possible. This is primarily useful for batching operations applied to active pagetables, which happens during mprotect, munmap, mremap and the like (mmap does not do bulk pagetable operations, so it isn't helped). Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com> Acked-by: Chris Wright <chrisw@sous-sol.org>	2007-07-18 08:47:44 -07:00
Jeremy Fitzhardinge	f120f13ea0	xen: Add support for preemption Add Xen support for preemption. This is mostly a cleanup of existing preempt_enable/disable calls, or just comments to explain the current usage. Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com> Signed-off-by: Chris Wright <chrisw@sous-sol.org>	2007-07-18 08:47:44 -07:00
Jeremy Fitzhardinge	f87e4cac4f	xen: SMP guest support This is a fairly straightforward Xen implementation of smp_ops. Xen has its own IPI mechanisms, and has no dependency on any APIC-based IPI. The smp_ops hooks and the flush_tlb_others pv_op allow a Xen guest to avoid all APIC code in arch/i386 (the only apic operation is a single apic_read for the apic version number). One subtle point which needs to be addressed is unpinning pagetables when another cpu may have a lazy tlb reference to the pagetable. Xen will not allow an in-use pagetable to be unpinned, so we must find any other cpus with a reference to the pagetable and get them to shoot down their references. Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com> Signed-off-by: Chris Wright <chrisw@sous-sol.org> Cc: Benjamin LaHaise <bcrl@kvack.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Andi Kleen <ak@suse.de>	2007-07-18 08:47:44 -07:00
Jeremy Fitzhardinge	ab55028886	xen: Implement sched_clock Implement xen_sched_clock, which returns the number of ns the current vcpu has been actually in an unstolen state (ie, running or blocked, vs runnable-but-not-running, or offline) since boot. Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com> Acked-by: Chris Wright <chrisw@sous-sol.org> Cc: john stultz <johnstul@us.ibm.com>	2007-07-18 08:47:43 -07:00
Jeremy Fitzhardinge	f91a8b447b	xen: Account for stolen time This patch accounts for the time stolen from our VCPUs. Stolen time is time where a vcpu is runnable and could be running, but all available physical CPUs are being used for something else. This accounting gets run on each timer interrupt, just as a way to get it run relatively often, and when interesting things are going on. Stolen time is not really used by much in the kernel; it is reported in /proc/stats, and that's about it. Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com> Acked-by: Chris Wright <chrisw@sous-sol.org> Cc: john stultz <johnstul@us.ibm.com> Cc: Rik van Riel <riel@redhat.com>	2007-07-18 08:47:43 -07:00
Jeremy Fitzhardinge	9a4029fd34	xen: ignore RW mapping of RO pages in pagetable_init When setting up the initial pagetable, which includes mappings of all low physical memory, ignore a mapping which tries to set the RW bit on an RO pte. An RO pte indicates a page which is part of the current pagetable, and so it cannot be allowed to become RW. Once xen_pagetable_setup_done is called, set_pte reverts to its normal behaviour. Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com> Acked-by: Chris Wright <chrisw@sous-sol.org> Cc: ebiederm@xmission.com (Eric W. Biederman)	2007-07-18 08:47:43 -07:00
Jeremy Fitzhardinge	f4f97b3ea9	xen: Complete pagetable pinning Xen requires all active pagetables to be marked read-only. When the base of the pagetable is loaded into %cr3, the hypervisor validates the entire pagetable and only allows the load to proceed if it all checks out. This is pretty slow, so to mitigate this cost Xen has a notion of pinned pagetables. Pinned pagetables are pagetables which are considered to be active even if no processor's cr3 is pointing to is. This means that it must remain read-only and all updates are validated by the hypervisor. This makes context switches much cheaper, because the hypervisor doesn't need to revalidate the pagetable each time. This also adds a new paravirt hook which is called during setup once the zones and memory allocator have been initialized. When the init_mm pagetable is first built, the struct page array does not yet exist, and so there's nowhere to put he init_mm pagetable's PG_pinned flags. Once the zones are initialized and the struct page array exists, we can set the PG_pinned flags for those pages. This patch also adds the Xen support for pte pages allocated out of highmem (highpte) by implementing xen_kmap_atomic_pte. Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com> Signed-off-by: Chris Wright <chrisw@sous-sol.org> Cc: Zach Amsden <zach@vmware.com>	2007-07-18 08:47:43 -07:00
Jeremy Fitzhardinge	c85b04c374	xen: add pinned page flag Add a new definition for PG_owner_priv_1 to define PG_pinned on Xen pagetable pages. Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com> Signed-off-by: Chris Wright <chrisw@sous-sol.org>	2007-07-18 08:47:43 -07:00
Jeremy Fitzhardinge	e738fca8d7	xen: configuration Put config options for Xen after the core pieces are in place. Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com> Signed-off-by: Chris Wright <chrisw@sous-sol.org>	2007-07-18 08:47:43 -07:00
Jeremy Fitzhardinge	15c84731d6	xen: time implementation Xen maintains a base clock which measures nanoseconds since system boot. This is provided to guests via a shared page which contains a base time in ns, a tsc timestamp at that point and tsc frequency parameters. Guests can compute the current time by reading the tsc and using it to extrapolate the current time from the basetime. The hypervisor makes sure that the frequency parameters are updated regularly, paricularly if the tsc changes rate or stops. This is implemented as a clocksource, so the interface to the rest of the kernel is a simple clocksource which simply returns the current time directly in nanoseconds. Xen also provides a simple timer mechanism, which allows a timeout to be set in the future. When that time arrives, a timer event is sent to the guest. There are two timer interfaces: - An old one which also delivers a stream of (unused) ticks at 100Hz, and on the same event, the actual timer events. The 100Hz ticks cause a lot of spurious wakeups, but are basically harmless. - The new timer interface doesn't have the 100Hz ticks, and can also fail if the specified time is in the past. This code presents the Xen timer as a clockevent driver, and uses the new interface by preference. Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com> Signed-off-by: Chris Wright <chrisw@sous-sol.org> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de>	2007-07-18 08:47:43 -07:00
Jeremy Fitzhardinge	e46cdb66c8	xen: event channels Xen implements interrupts in terms of event channels. Each guest domain gets 1024 event channels which can be used for a variety of purposes, such as Xen timer events, inter-domain events, inter-processor events (IPI) or for real hardware IRQs. Within the kernel, we map the event channels to IRQs, and implement the whole interrupt handling using a Xen irq_chip. Rather than setting NR_IRQ to 1024 under PARAVIRT in order to accomodate Xen, we create a dynamic mapping between event channels and IRQs. Ideally, Linux will eventually move towards dynamically allocating per-irq structures, and we can use a 1:1 mapping between event channels and irqs. Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com> Signed-off-by: Chris Wright <chrisw@sous-sol.org> Cc: Ingo Molnar <mingo@elte.hu> Cc: Eric W. Biederman <ebiederm@xmission.com>	2007-07-18 08:47:42 -07:00
Jeremy Fitzhardinge	3b827c1b3a	xen: virtual mmu Xen pagetable handling, including the machinery to implement direct pagetables. Xen presents the real CPU's pagetables directly to guests, with no added shadowing or other layer of abstraction. Naturally this means the hypervisor must maintain close control over what the guest can put into the pagetable. When the guest modifies the pte/pmd/pgd, it must convert its domain-specific notion of a "physical" pfn into a global machine frame number (mfn) before inserting the entry into the pagetable. Xen will check to make sure the domain is allowed to create a mapping of the given mfn. Xen also requires that all mappings the guest has of its own active pagetable are read-only. This is relatively easy to implement in Linux because all pagetables share the same pte pages for kernel mappings, so updating the pte in one pagetable will implicitly update the mapping in all pagetables. Normally a pagetable becomes active when you point to it with cr3 (or the Xen equivalent), but when you do so, Xen must check the whole pagetable for correctness, which is clearly a performance problem. Xen solves this with pinning which keeps a pagetable effectively active even if its currently unused, which means that all the normal update rules are enforced. This means that it need not revalidate the pagetable when loading cr3. This patch has a first-cut implementation of pinning, but it is more fully implemented in a later patch. Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com> Signed-off-by: Chris Wright <chrisw@sous-sol.org>	2007-07-18 08:47:42 -07:00
Jeremy Fitzhardinge	5ead97c84f	xen: Core Xen implementation This patch is a rollup of all the core pieces of the Xen implementation, including: - booting and setup - pagetable setup - privileged instructions - segmentation - interrupt flags - upcalls - multicall batching BOOTING AND SETUP The vmlinux image is decorated with ELF notes which tell the Xen domain builder what the kernel's requirements are; the domain builder then constructs the address space accordingly and starts the kernel. Xen has its own entrypoint for the kernel (contained in an ELF note). The ELF notes are set up by xen-head.S, which is included into head.S. In principle it could be linked separately, but it seems to provoke lots of binutils bugs. Because the domain builder starts the kernel in a fairly sane state (32-bit protected mode, paging enabled, flat segments set up), there's not a lot of setup needed before starting the kernel proper. The main steps are: 1. Install the Xen paravirt_ops, which is simply a matter of a structure assignment. 2. Set init_mm to use the Xen-supplied pagetables (analogous to the head.S generated pagetables in a native boot). 3. Reserve address space for Xen, since it takes a chunk at the top of the address space for its own use. 4. Call start_kernel() PAGETABLE SETUP Once we hit the main kernel boot sequence, it will end up calling back via paravirt_ops to set up various pieces of Xen specific state. One of the critical things which requires a bit of extra care is the construction of the initial init_mm pagetable. Because Xen places tight constraints on pagetables (an active pagetable must always be valid, and must always be mapped read-only to the guest domain), we need to be careful when constructing the new pagetable to keep these constraints in mind. It turns out that the easiest way to do this is use the initial Xen-provided pagetable as a template, and then just insert new mappings for memory where a mapping doesn't already exist. This means that during pagetable setup, it uses a special version of xen_set_pte which ignores any attempt to remap a read-only page as read-write (since Xen will map its own initial pagetable as RO), but lets other changes to the ptes happen, so that things like NX are set properly. PRIVILEGED INSTRUCTIONS AND SEGMENTATION When the kernel runs under Xen, it runs in ring 1 rather than ring 0. This means that it is more privileged than user-mode in ring 3, but it still can't run privileged instructions directly. Non-performance critical instructions are dealt with by taking a privilege exception and trapping into the hypervisor and emulating the instruction, but more performance-critical instructions have their own specific paravirt_ops. In many cases we can avoid having to do any hypercalls for these instructions, or the Xen implementation is quite different from the normal native version. The privileged instructions fall into the broad classes of: Segmentation: setting up the GDT and the GDT entries, LDT, TLS and so on. Xen doesn't allow the GDT to be directly modified; all GDT updates are done via hypercalls where the new entries can be validated. This is important because Xen uses segment limits to prevent the guest kernel from damaging the hypervisor itself. Traps and exceptions: Xen uses a special format for trap entrypoints, so when the kernel wants to set an IDT entry, it needs to be converted to the form Xen expects. Xen sets int 0x80 up specially so that the trap goes straight from userspace into the guest kernel without going via the hypervisor. sysenter isn't supported. Kernel stack: The esp0 entry is extracted from the tss and provided to Xen. TLB operations: the various TLB calls are mapped into corresponding Xen hypercalls. Control registers: all the control registers are privileged. The most important is cr3, which points to the base of the current pagetable, and we handle it specially. Another instruction we treat specially is CPUID, even though its not privileged. We want to control what CPU features are visible to the rest of the kernel, and so CPUID ends up going into a paravirt_op. Xen implements this mainly to disable the ACPI and APIC subsystems. INTERRUPT FLAGS Xen maintains its own separate flag for masking events, which is contained within the per-cpu vcpu_info structure. Because the guest kernel runs in ring 1 and not 0, the IF flag in EFLAGS is completely ignored (and must be, because even if a guest domain disables interrupts for itself, it can't disable them overall). (A note on terminology: "events" and interrupts are effectively synonymous. However, rather than using an "enable flag", Xen uses a "mask flag", which blocks event delivery when it is non-zero.) There are paravirt_ops for each of cli/sti/save_fl/restore_fl, which are implemented to manage the Xen event mask state. The only thing worth noting is that when events are unmasked, we need to explicitly see if there's a pending event and call into the hypervisor to make sure it gets delivered. UPCALLS Xen needs a couple of upcall (or callback) functions to be implemented by each guest. One is the event upcalls, which is how events (interrupts, effectively) are delivered to the guests. The other is the failsafe callback, which is used to report errors in either reloading a segment register, or caused by iret. These are implemented in i386/kernel/entry.S so they can jump into the normal iret_exc path when necessary. MULTICALL BATCHING Xen provides a multicall mechanism, which allows multiple hypercalls to be issued at once in order to mitigate the cost of trapping into the hypervisor. This is particularly useful for context switches, since the 4-5 hypercalls they would normally need (reload cr3, update TLS, maybe update LDT) can be reduced to one. This patch implements a generic batching mechanism for hypercalls, which gets used in many places in the Xen code. Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com> Signed-off-by: Chris Wright <chrisw@sous-sol.org> Cc: Ian Pratt <ian.pratt@xensource.com> Cc: Christian Limpach <Christian.Limpach@cl.cam.ac.uk> Cc: Adrian Bunk <bunk@stusta.de>	2007-07-18 08:47:42 -07:00
Jeremy Fitzhardinge	a42089dd35	xen: Add Xen interface header files Add Xen interface header files. These are taken fairly directly from the Xen tree, but somewhat rearranged to suit the kernel's conventions. Define macros and inline functions for doing hypercalls into the hypervisor. Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com> Signed-off-by: Ian Pratt <ian.pratt@xensource.com> Signed-off-by: Christian Limpach <Christian.Limpach@cl.cam.ac.uk> Signed-off-by: Chris Wright <chrisw@sous-sol.org>	2007-07-18 08:47:42 -07:00
Jeremy Fitzhardinge	24037a8b69	Add nosegneg capability to the vsyscall page notes Add the "nosegneg" fake capabilty to the vsyscall page notes. This is used by the runtime linker to select a glibc version which then disables negative-offset accesses to the thread-local segment via %gs. These accesses require emulation in Xen (because segments are truncated to protect the hypervisor address space) and avoiding them provides a measurable performance boost. Signed-off-by: Ian Pratt <ian.pratt@xensource.com> Signed-off-by: Christian Limpach <Christian.Limpach@cl.cam.ac.uk> Signed-off-by: Chris Wright <chrisw@sous-sol.org> Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com> Acked-by: Zachary Amsden <zach@vmware.com> Cc: Roland McGrath <roland@redhat.com> Cc: Ulrich Drepper <drepper@redhat.com>	2007-07-18 08:47:42 -07:00
Jeremy Fitzhardinge	688340ea34	Add a sched_clock paravirt_op The tsc-based get_scheduled_cycles interface is not a good match for Xen's runstate accounting, which reports everything in nanoseconds. This patch replaces this interface with a sched_clock interface, which matches both Xen and VMI's requirements. In order to do this, we: 1. replace get_scheduled_cycles with sched_clock 2. hoist cycles_2_ns into a common header 3. update vmi accordingly One thing to note: because sched_clock is implemented as a weak function in kernel/sched.c, we must define a real function in order to override this weak binding. This means the usual paravirt_ops technique of using an inline function won't work in this case. Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com> Cc: Zachary Amsden <zach@vmware.com> Cc: Dan Hecht <dhecht@vmware.com> Cc: john stultz <johnstul@us.ibm.com>	2007-07-18 08:47:42 -07:00
Jeremy Fitzhardinge	d572929cdd	paravirt: helper to disable all IO space In a virtual environment, device drivers such as legacy IDE will waste quite a lot of time probing for their devices which will never appear. This helper function allows a paravirt implementation to lay claim to the whole iomem and ioport space, thereby disabling all device drivers trying to claim IO resources. Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com> Signed-off-by: Chris Wright <chrisw@sous-sol.org> Cc: Rusty Russell <rusty@rustcorp.com.au>	2007-07-18 08:47:42 -07:00
Jeremy Fitzhardinge	5f4352fbff	Allocate and free vmalloc areas Allocate/release a chunk of vmalloc address space: alloc_vm_area reserves a chunk of address space, and makes sure all the pagetables are constructed for that address range - but no pages. free_vm_area releases the address space range. Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com> Signed-off-by: Ian Pratt <ian.pratt@xensource.com> Signed-off-by: Christian Limpach <Christian.Limpach@cl.cam.ac.uk> Signed-off-by: Chris Wright <chrisw@sous-sol.org> Cc: "Jan Beulich" <JBeulich@novell.com> Cc: "Andi Kleen" <ak@muc.de>	2007-07-18 08:47:41 -07:00
Jeremy Fitzhardinge	bdef40a6af	paravirt: export __supported_pte_mask __supported_pte_mask is needed when constructing pte values. Xen device drivers need to do this to make mappings of foreign pages (ie, pages granted to us by other domains). Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>	2007-07-18 08:47:41 -07:00

... 2 3 4 5 6 ...

61422 Commits