Commit 1058a75f07 ("xen: actually release
memory when shrinking domain") causes a crash if the page being released
is a highmem page.
If a page is highmem then there is no need to unmap it.
Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Acked-by: Jeremy Fitzhardinge <jeremy@goop.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
On -rt we were seeing spurious bad page states like:
Bad page state in process 'firefox'
page:c1bc2380 flags:0x40000000 mapping:c1bc2390 mapcount:0 count:0
Trying to fix it up, but a reboot is needed
Backtrace:
Pid: 503, comm: firefox Not tainted 2.6.26.8-rt13 #3
[<c043d0f3>] ? printk+0x14/0x19
[<c0272d4e>] bad_page+0x4e/0x79
[<c0273831>] free_hot_cold_page+0x5b/0x1d3
[<c02739f6>] free_hot_page+0xf/0x11
[<c0273a18>] __free_pages+0x20/0x2b
[<c027d170>] __pte_alloc+0x87/0x91
[<c027d25e>] handle_mm_fault+0xe4/0x733
[<c043f680>] ? rt_mutex_down_read_trylock+0x57/0x63
[<c043f680>] ? rt_mutex_down_read_trylock+0x57/0x63
[<c0218875>] do_page_fault+0x36f/0x88a
This is the case where a concurrent fault already installed the PTE and
we get to free the newly allocated one.
This is due to pgtable_page_ctor() doing the spin_lock_init(&page->ptl)
which is overlaid with the {private, mapping} struct.
union {
struct {
unsigned long private;
struct address_space *mapping;
};
spinlock_t ptl;
struct kmem_cache *slab;
struct page *first_page;
};
Normally the spinlock is small enough to not stomp on page->mapping, but
PREEMPT_RT=y has huge 'spin'locks.
But lockdep kernels should also be able to trigger this splat, as the
lock tracking code grows the spinlock to cover page->mapping.
The obvious fix is calling pgtable_page_dtor() like the regular pte free
path __pte_free_tlb() does.
It seems all architectures except x86 and nm10300 already do this, and
nm10300 doesn't seem to use pgtable_page_ctor(), which suggests it
doesn't do SMP or simply doesnt do MMU at all or something.
Signed-off-by: Peter Zijlstra <a.p.zijlsta@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Cc: <stable@kernel.org>
Add wrapper functions for the following compat system calls:
* readahead
* sendfile64
* tkill
* tgkill
* keyctl
This ensures that the high order bits of the parameter registers are correctly
sign extended.
Signed-off-by: Michael Holzheu <holzheu@linux.vnet.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Precreate stop_machine threads in case the machine supports ETR/STP.
Otherwise we might deadlock if a time sync operation gets scheduled
and the creation of stop_machine threads would cause disk I/O.
This is just the minimal fix.
The real fix would be to only precreate stop_machine threads if
ETR/STP is actually used. But that would be a rather large and
complicated patch.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
On (initial) cpu hotplug the lowcore values for user_timer and
system_timer don't get initialized like they would get on each
process schedule.
On initial start of secondary cpus this leads to the situation
where per thread user/system_timer values are larger than the
corresponding contents of the lowcore. When later calculating
time spent in user/system context the result can be negative.
So for cpu hotplug we should manually initialize lowcore values.
Fixes this bug:
Kernel BUG at 000ec080 [verbose debug info unavailable]
fixpoint divide exception: 0009 [#1] PREEMPT SMP
Modules linked in:
CPU: 10 Not tainted 2.6.28 #4
Process sysctl (pid: 975, task: 3fa752e0, ksp: 3fbebca0)
Krnl PSW : 070c1000 800ec080 (show_stat+0x390/0x5fc)
R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:0 CC:1 PM:0
Krnl GPRS: 7fffffff fefc7ce5 3faec080 003879ae
00000001 01388000 7fffffff 01388000
00000000 00000000 0049ad50 3fbebcf8
01388000 002f51a8 800ec1fe 3fbebcf8
Krnl Code: 800ec076: 9001b188 stm %r0,%r1,392(%r11)
800ec07a: 9801b0c0 lm %r0,%r1,192(%r11)
800ec07e: 1d05 dr %r0,%r5
>800ec080: 9001b0c0 stm %r0,%r1,192(%r11)
800ec084: 5860b0c4 l %r6,196(%r11)
800ec088: 1806 lr %r0,%r6
800ec08a: 8c800001 srdl %r8,1
800ec08e: 1d87 dr %r8,%r7
Call Trace:
([<00000000000ec1ee>] show_stat+0x4fe/0x5fc)
[<00000000000c13e8>] seq_read+0xc4/0x3ac
[<00000000000e4796>] proc_reg_read+0x6e/0x9c
[<00000000000a6a44>] vfs_read+0x78/0x100
[<00000000000a6ba8>] sys_read+0x40/0x80
[<00000000000234a8>] sysc_do_restart+0x1a/0x1e
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
When 31 bit user space programs call sigaltstack on a 64 bit Linux
OS, the system call returns -1 with errno=EFAULT. The 31 bit pointer passed
to the system call is extended to 64 bit, but the high order bits are not
set to zero. The kernel detects the invalid user space pointer and
returns -EFAULT. To solve the problem, sys32_sigaltstack_wrapper()
instead of sys32_sigaltstack() has to be called. The wrapper function sets
the high order bits to zero.
Signed-off-by: Michael Holzheu <holzheu@linux.vnet.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Use the personality() macro to mask out all bits that are not
relevant for the personality type.
The personality field contains bits for other things as well,
so without masking out the not relevalent bits the comparison
won't do what is expected.
Reported-by: Andreas Krebbel <krebbel@linux.vnet.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
LLTX is deprecated and complicated, don't use it. It was observed by Don Ash
<donash4@gmail.com> that e1000e was acquiring this lock in the NAPI cleanup
path. This is obviously a bug, as this is a leftover from when e1000
supported multiple tx queues and fake netdevs.
another user reported this to us and tested routing with the 2.6.27 kernel and
this patch and reported a 3.5 % improvement in packets forwarded in a
multi-port test on 82571 parts.
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Some revisions of the 92hd8xxx codec's not supporting port power
downs in which the using of it causes capture and also randomly
playback streams to not function at all. Thus by disabling it by
default and adding a option to enable it manually will fix all issue
on current and future revisions.
Signed-off-by: Matthew Ranostay <mranostay@embeddedalley.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Port 0xe power mapping was incorrect set to 0x80 changed to the correct
value 0x40.
Signed-off-by: Matthew Ranostay <mranostay@embeddedalley.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
There is a race between sctp_rcv() and sctp_accept() where we
have moved the association from the listening socket to the
accepted socket, but sctp_rcv() processing cached the old
socket and continues to use it.
The easy solution is to check for the socket mismatch once we've
grabed the socket lock. If we hit a mis-match, that means
that were are currently holding the lock on the listening socket,
but the association is refrencing a newly accepted socket. We need
to drop the lock on the old socket and grab the lock on the new one.
A more proper solution might be to create accepted sockets when
the new association is established, similar to TCP. That would
eliminate the race for 1-to-1 style sockets, but it would still
existing for 1-to-many sockets where a user wished to peeloff an
association. For now, we'll live with this easy solution as
it addresses the problem.
Reported-by: Michal Hocko <mhocko@suse.cz>
Reported-by: Karsten Keil <kkeil@suse.de>
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Recent changes to the retransmit code exposed a long standing
bug where it was possible for a chunk to be time stamped
after the retransmit timer was reset. This caused a rare
situation where the retrnamist timer has expired, but
nothing was marked for retrnasmission because all of
timesamps on data were less then 1 rto ago. As result,
the timer was never restarted since nothing was retransmitted,
and this resulted in a hung association that did couldn't
complete the data transfer. The solution is to timestamp
the chunk when it's added to the packet for transmission
purposes. After the packet is trsnmitted the rtx timer
is restarted. This guarantees that when the timer expires,
there will be data to retransmit.
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Commit 62aeaff5cc
(sctp: Start T3-RTX timer when fast retransmitting lowest TSN)
introduced a regression where it was possible to forcibly
restart the sctp retransmit timer at the transmission of any
new chunk. This resulted in much longer timeout times and
sometimes hung sctp connections.
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
crc32c algorithm provides a byteswaped result. On little-endian
arches, the result ends up in big-endian/network byte order.
On big-endinan arches, the result ends up in little-endian
order and needs to be byte swapped again. Thus calling cpu_to_le32
gives the right output.
Tested-by: Jukka Taimisto <jukka.taimisto@mail.suomi.net>
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix this:
> It appears that in the upstream balloon driver,
> the call to HYPERVISOR_update_va_mapping is missing
> from decrease_reservation. I think as a result,
> the balloon driver is eating memory but not
> releasing it to Xen, thus rendering the balloon
> driver essentially useless. (Can be observed via xentop.)
Signed-off-by: Dan Magenheimer <dan.magenheimer@oracle.com>
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
All supported SMSC PHYs implement the standard "power down" bit 11 of
BMCR, so this patch adds support using the generic genphy_{suspend,resume}
functions.
Signed-off-by: Steve Glendinning <steve.glendinning@smsc.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch reorders functions so that we do longer need
forward declarations.
Signed-off-by: Florian Fainelli <florian@openwrt.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
This last patch makes the appropriate changes to use and propagate the
network namespace where needed in IPv4 multicast routing code.
This consists mainly in replacing all the remaining init_net occurences
with current netns pointer retrieved from sockets, net devices or
mfc_caches depending on the routines' contexts.
Some routines receive a new 'struct net' parameter to propagate the current
netns:
* vif_add/vif_delete
* ipmr_new_tunnel
* mroute_clean_tables
* ipmr_cache_find
* ipmr_cache_report
* ipmr_cache_unresolved
* ipmr_mfc_add/ipmr_mfc_delete
* ipmr_get_route
* rt_fill_info (in route.c)
Signed-off-by: Benjamin Thery <benjamin.thery@bull.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Declare IPv4 multicast forwarding /proc/net entries per-namespace:
/proc/net/ip_mr_vif
/proc/net/ip_mr_cache
Signed-off-by: Benjamin Thery <benjamin.thery@bull.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Preliminary work to make IPv4 multicast routing netns-aware.
Declare variable 'reg_vif_num' per-namespace, move into struct netns_ipv4.
At the moment, this variable is only referenced in init_net.
Signed-off-by: Benjamin Thery <benjamin.thery@bull.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Preliminary work to make IPv4 multicast routing netns-aware.
Declare IPv multicast routing variables 'mroute_do_assert' and
'mroute_do_pim' per-namespace in struct netns_ipv4.
At the moment, these variables are only referenced in init_net.
Signed-off-by: Benjamin Thery <benjamin.thery@bull.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Preliminary work to make IPv4 multicast routing netns-aware.
Declare variable cache_resolve_queue_len per-namespace: move it into
struct netns_ipv4.
This variable counts the number of unresolved cache entries queued in the
list mfc_unres_queue. This list is kept global to all netns as the number
of entries per namespace is limited to 10 (hardcoded in routine
ipmr_cache_unresolved).
Entries belonging to different namespaces in mfc_unres_queue will be
identified by matching the mfc_net member introduced previously in
struct mfc_cache.
Keeping this list global to all netns, also allows us to keep a single
timer (ipmr_expire_timer) to handle their expiration.
In some places cache_resolve_queue_len value was tested for arming
or deleting the timer. These tests were equivalent to testing
mfc_unres_queue value instead and are replaced in this patch.
At the moment, cache_resolve_queue_len is only referenced in init_net.
Signed-off-by: Benjamin Thery <benjamin.thery@bull.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Preliminary work to make IPv4 multicast routing netns-aware.
Dynamically allocate IPv4 multicast forwarding cache, mfc_cache_array,
and move it to struct netns_ipv4.
At the moment, mfc_cache_array is only referenced in init_net.
Signed-off-by: Benjamin Thery <benjamin.thery@bull.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch stores into struct mfc_cache the network namespace each
mfc_cache belongs to. The new member is mfc_net.
mfc_net is assigned at cache allocation and doesn't change during
the rest of the cache entry life.
A new net parameter is added to ipmr_cache_alloc/ipmr_cache_alloc_unres.
This will help to retrieve the current netns around the IPv4 multicast
routing code.
At the moment, all mfc_cache are allocated in init_net.
Signed-off-by: Benjamin Thery <benjamin.thery@bull.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Preliminary work to make IPv6 multicast routing netns-aware.
Dynamically allocate interface table vif_table and move it to
struct netns_ipv4, and update MIF_EXISTS() macro.
At the moment, vif_table is only referenced in init_net.
Signed-off-by: Benjamin Thery <benjamin.thery@bull.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Preliminary work to make IPv4 multicast routing netns-aware.
Make IPv4 multicast routing mroute_socket per-namespace,
moves it into struct netns_ipv4.
At the moment, mroute_socket is only referenced in init_net.
Signed-off-by: Benjamin Thery <benjamin.thery@bull.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
While was playing with PPP namespaces I occasionally brought
back DECLARE_MAC_BUF which is not needed (we have %pM here).
Fix it.
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Suspend/resume routines check for phydrv != NULL, but that is
wrong because "phydrv" comes from container_of(drv). If drv is NULL,
then container_of(drv) will return non-NULL result, and the checks
won't work.
The Freescale TBI PHYs are driver-less, so "drv" is NULL, and that
leads to the following oops:
Unable to handle kernel paging request for data at address 0xffffffe4
Faulting instruction address: 0xc0215554
Oops: Kernel access of bad area, sig: 11 [#1]
[...]
NIP [c0215554] mdio_bus_suspend+0x34/0x70
LR [c01cc508] suspend_device+0x258/0x2bc
Call Trace:
[cfad3da0] [cfad3db8] 0xcfad3db8 (unreliable)
[cfad3db0] [c01cc508] suspend_device+0x258/0x2bc
[cfad3dd0] [c01cc62c] dpm_suspend+0xc0/0x140
[cfad3e20] [c01cc6f4] device_suspend+0x48/0x5c
[cfad3e40] [c0068dd8] suspend_devices_and_enter+0x8c/0x148
[cfad3e60] [c00690f8] enter_state+0x100/0x118
[cfad3e80] [c00691c0] state_store+0xb0/0xe4
[cfad3ea0] [c018c938] kobj_attr_store+0x24/0x3c
[cfad3eb0] [c00ea9a8] flush_write_buffer+0x58/0x7c
[cfad3ed0] [c00eadf0] sysfs_write_file+0x58/0xa0
[cfad3ef0] [c009e810] vfs_write+0xb4/0x16c
[cfad3f10] [c009ed40] sys_write+0x4c/0x90
[cfad3f40] [c0014954] ret_from_syscall+0x0/0x38
[...]
This patch fixes the issue, plus removes unneeded parentheses
and fixes indentation level in mdio_bus_suspend().
Signed-off-by: Anton Vorontsov <avorontsov@ru.mvista.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
A link change interrupt might be queued and activated after the loopback was set
and it will cause the loopback to fail. The PHY lock should be kept until the
loopback test is over.
That implies that the bnx2x_test_link should used within the loopback function
and not bnx2x_wait_for_link since that function also takes the PHY link
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Waiting for the FW to response requires a memory barrier
Signed-off-by: Michal Kalderon <michals@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
rmmod might hang without this patch since the reference counter is not going
down
Signed-off-by: Yitchak Gertner <gertner@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Call carrier off should not be called after register_netdev since after
register netdev open can be called at any time followed by an interrupt that
will set it to carrier_on and the probe will resume control and set it to off
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Failures on load were not handled correctly - separate the flow to handle
different failures
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Calling napi disabled unconditionally at netif stop
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
To avoid complications, make sure that the HW is in reset (as it should be)
before trying to take it out of reset. In normal flows, the HW is indeed in rest
so this should have no effect
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
be consistent with mac80211 drivers and return correct return code.
NETDEV_TX_OK is 0, but we need to be consistent wrt formatting amongst
implementations
re: http://marc.info/?l=linux-wireless&m=123119327419865&w=2
Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
Reviewed-by: Tomas Winkler <tomas.winkler@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Giuseppe Cala <jiveaxe@gmail.com> (The second "a" in "Cala" should be
a grave, U+00E0) reported success on zd1211-devs@lists.sourceforge.net.
The chip info is:
zd1211b chip 0df6:0036 v4810 high 00-0c-f6 AL2230_RF pa0 g--N-
The Sitecom WL-603 is detected as a zd1211b with a AL2230 RF transceiver chip.
Signed-off-by: Giuseppe Cala <jiveaxe@gmail.com>
Signed-off-by: Hin-Tak Leung <htl10@users.sourceforge.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
In theory, the firmware acks the received a data frame, before signaling the driver to free it again.
However Artur Skawina <art.08.09@gmail.com> has shown that it can happen in reverse order as well.
This is very bad and could lead to memory corruptions, oopses and panics.
Thanks to Artur Skawina <art.08.09@gmail.com> for reporting and debugging this issue.
Signed-off-by: Christian Lamparter <chunkeey@web.de>
Tested-by: Artur Skawina <art.08.09@gmail.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
If we let the firmware do the data encryption, we have to remove the ICV and
(M)MIC at the end of the frame before we can give it back to mac80211.
Or, these data frames have a few trailing bytes on cooked monitor interfaces.
Signed-off-by: Christian Lamparter <chunkeey@web.de>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
This patch fixes a obvious memory leak in the eeprom parser.
Signed-off-by: Christian Lamparter <chunkeey@web.de>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
KERN_INFO is too "loud" for messages that are generated by the ordinary
events, such as accociation. Use of KERN_DEBUG is consistent with
mac80211.
Suggested by Michael Gilbert <michael.s.gilbert@gmail.com>
Signed-off-by: Pavel Roskin <proski@gnu.org>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Mac80211 provides 2 structures to handle bitrates, namely
ieee80211_rate and ieee80211_tx_rate. To determine the short preamble
mode for an outgoing frame, the flag IEEE80211_TX_RC_USE_SHORT_PREAMBLE
must be checked on ieee80211_tx_rate and not ieee80211_rate (which rt2x00 did).
This fixes a regression which was triggered in 2.6.29-rcX as reported by Chris Clayton.
Signed-off-by: Ivo van Doorn <IvDoorn@gmail.com>
Tested-By: Chris Clayton <chris2553@googlemail.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
wlan0: switched to short barker preamble (BSSID=00:01:aa:bb:cc:dd)
wlan0: switched to short slot (BSSID=) <something is missing here>
should be:
wlan0: switched to short barker preamble (BSSID=00:01:aa:bb:cc:dd)
wlan0: switched to short slot (BSSID=00:01:aa:bb:cc:dd)
Signed-off-by: Christian Lamparter <chunkeey@web.de>
Signed-off-by: John W. Linville <linville@tuxdriver.com>