As was already pointed out Mathieu Avila on Thu, 07 Sep 2006 03:15:25 -0700
that OCFS2 is expecting bio_add_page() to add pages to BIOs in an easily
predictable manner.
That is not true, especially for devices with own merge_bvec_fn().
Therefore OCFS2's heartbeat code is very likely to fail on such devices.
Move the bio_put() call into the bio's bi_end_io() function. This makes the
whole idea of trying to predict the behaviour of bio_add_page() unnecessary.
Removed compute_max_sectors() and o2hb_compute_request_limits().
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
When there is a lot of multithreaded I/O usage, two threads can collide
while sending out a message to the other nodes. This is due to the lack of
locking between threads while sending out the messages.
When a connected TCP send(), sendto(), or sendmsg() arrives in the Linux
kernel, it eventually comes through tcp_sendmsg(). tcp_sendmsg() protects
itself by acquiring a lock at invocation by calling lock_sock().
tcp_sendmsg() then loops over the buffers in the iovec, allocating
associated sk_buff's and cache pages for use in the actual send. As it does
so, it pushes the data out to tcp for actual transmission. However, if one
of those allocation fails (because a large number of large sends is being
processed, for example), it must wait for memory to become available. It
does so by jumping to wait_for_sndbuf or wait_for_memory, both of which
eventually cause a call to sk_stream_wait_memory(). sk_stream_wait_memory()
contains a code path that calls sk_wait_event(). Finally, sk_wait_event()
contains the call to release_sock().
The following patch adds a lock to the socket container in order to
properly serialize outbound requests.
From: Zhen Wei <zwei@novell.com>
Acked-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
OCFS2: drop 'depends on INET' since local mounts are now allowed.
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
Currently the ocfs2 dlm has no timeout during dlm join domain. While this is
not a problem in normal operation, this does become an issue if, say, the
other node is refusing to let the node join the domain because of a stuck
recovery. This patch adds a 90 sec timeout.
Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
These messages can easily be activated using the mlog infrastructure
and don't need to be enabled by default.
Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
There is a small window where a joining node may not see the node(s) that
just died but are still part of the domain. To fix this, we must disallow
join requests if the joining node has a different node map.
A new field node_map is added to dlm_query_join_request to send the current
nodes nodemap along with join request. On the receiving end the nodes that
are part of the cluster verifies if this new node sees all the nodes that
are still part of the cluster. They disallow the join if the maps mismatch.
Signed-off-by: Srinivas Eeda <srinivas.eeda@oracle.com>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
Eventhough the set refmap bit message is sent before the clear refmap
message, currently there is no guarentee that the set message will be
handled before the clear. This patch prevents the clear refmap to be
processed while the node is sending assert master messages to other
nodes. (The set refmap message is sent as a response to the assert
master request).
Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
This patch binds the o2net listener to the configured ip address
instead of INADDR_ANY for security. Fixes oss.oracle.com bugzilla#814.
Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
This patch prevents the dlm from sending the clear refmap message
before the set refmap. We use the newly created post function handler
routine to accomplish the task.
Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com>
Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
Currently o2net allows one handler function per message type. This
patch adds the ability to call another function to be called after
the handler has returned the message to the other node.
Handlers are now given the option of returning a context (in the form of a
void **) which will be passed back into the post message handler function.
Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com>
Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
The dlm encodes the node number and a sequence number in the lock cookie.
It also stores the cookie in the lockres in the big endian format to avoid
swapping 8 bytes on each lock request. The bug here was that it was assuming
the cookie to be in the cpu format when decoding it for printing the error
message. This patch swaps the bytes before the print.
Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com>
Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
When the lockres is in migrate or recovery state, all convert requests
are denied with the appropriate error status that is handled on the
requester node. This patch silences the erroneous error message printed
on the master node.
Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com>
Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
The dlm was not waking up threads waiting on the lockres wait queue,
waiting for the lockres to be no longer be in the DLM_LOCK_RES_IN_PROGRESS
and the DLM_LOCK_RES_MIGRATING states.
Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com>
Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
dlm_dispatch_work was not processing the queued up tasks at
the first sign of the node leaving the domain leading to not
only incompleted tasks but also a mismatch in the dlm refcnt.
Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com>
Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
This is to prevent the condition in which a previously queued
up assert master asserts after we start the migration. Now
migration ensures the workqueue is flushed before proceeding
with migrating the lock to another node. This condition is
typically encountered during parallel umounts.
Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com>
Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
The migrate lockres handler was only searching for its lock on
migrated lockres on the expected queue. This could be problematic
as the new master could have also issued a convert request
during the migration and thus moved the lock to the convert queue.
We now search for the lock on all three queues.
Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com>
Signed-off-by: Sunil Mushran <Sunil.Mushran@oracle.com>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
dlmunlock() was not waiting for migration to complete before releasing locks
on locally mastered locks.
Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com>
Signed-off-by: Sunil Mushran <Sunil.Mushran@oracle.com>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
dlmthread was removing lockres' from the dirty list
and resetting the dirty flag before shuffling the list.
This patch retains the dirty state flag until the lists
are shuffled.
Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com>
Signed-off-by: Sunil Mushran <Sunil.Mushran@oracle.com>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
This patch makes some needlessly global functions static.
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
This was previously broken and migration of some locks had to be temporarily
disabled. We use a new (and backward-incompatible) set of network messages
to account for all references to a lock resources held across the cluster.
once these are all freed, the master node may then free the lock resource
memory once its local references are dropped.
Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/shaggy/jfs-2.6:
JFS: Remove incorrect kgdb define
JFS: call io_schedule() instead of schedule() to avoid deadlock
JFS: Add lockdep annotations
JFS: Avoid BUG() on a damaged file system
* git://git.kernel.org/pub/scm/linux/kernel/git/steve/gfs2-2.6-nmw: (57 commits)
[GFS2] make gfs2_writepages() static
[GFS2] Unlock page on prepare_write try lock failure
[GFS2] nfsd readdirplus assertion failure
[DLM] fix softlockup in dlm_recv
[DLM] zero new user lvbs
[DLM/GFS2] indent help text
[GFS2] Fix unlink deadlocks
[GFS2] Put back semaphore to avoid umount problem
[GFS2] more CURRENT_TIME_SEC
[GFS2/DLM] fix GFS2 circular dependency
[GFS2/DLM] use sysfs
[GFS2] make lock_dlm drop_count tunable in sysfs
[GFS2] increase default lock limit
[GFS2] Fix list corruption in lops.c
[GFS2] Fix recursive locking attempt with NFS
[DLM] can miss clearing resend flag
[DLM] saved dlm message can be dropped
[DLM] Make sock_sem into a mutex
[GFS2] Fix typo in glock.c
[GFS2] use CURRENT_TIME_SEC instead of get_seconds in gfs2
...
On Mon, Jan 29, 2007 at 08:45:28PM -0800, Andrew Morton wrote:
>...
> Changes since 2.6.20-rc6-mm2:
>...
> git-gfs2-nmw.patch
>...
> git trees
>...
This patch makes the needlessly global gfs2_writepages() static.
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
When the try lock of the glock failed in prepare_write we were
incorrectly exiting this function with the page still locked.
This was resulting in further I/O to this page hanging.
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
* 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux-2.6: (27 commits)
[IA64] swiotlb abstraction (e.g. for Xen)
[IA64] swiotlb cleanup
[IA64] make swiotlb use bus_to_virt/virt_to_bus
[IA64] swiotlb bug fixes
[IA64] Hook up getcpu system call for IA64
[IA64] clean up sparsemem memory_present call
[IA64] show_mem() for IA64 sparsemem NUMA
[IA64] missing exports hwsw_sync_...
[IA64] virt_to_page() can be called with NULL arg
[IA64] alignment bug in ldscript
[IA64] register memory ranges in a consistent manner
[IA64] Enable SWIOTLB only when needed
[IA64-SGI] Check for TIO errors on shub2 Altix
[IA64] remove bogus prototype ia64_esi_init()
[IA64] Clear IRQ affinity when unregistered
[IA64] fix ACPI Kconfig issues
[IA64] Fix NULL-pointer dereference in ia64_machine_kexec()
[IA64] find thread for user rbs address
[IA64] use snprintf() on features field of /proc/cpuinfo
[IA64] enable singlestep on system call
...
* 'for-linus' of master.kernel.org:/pub/scm/linux/kernel/git/jikos/hid:
USB HID: handle multi-interface devices for Apple macbook pro properly
HID: move away from DEBUG defines in favor of CONFIG_HID_DEBUG
USB HID: fix bogus comment in hid_get_class_descriptor()
USB HID: remove hid_find_field_by_usage()
HID: API - fix leftovers of hidinput API in USB HID
HID: hid debug from hid-debug.h to hid layer
hid: force feedback driver for PantherLord USB/PS2 2in1 Adapter
hid: quirk for multi-input devices with unneeded output reports
hid: allow force feedback for multi-input devices
* 'for-linus' of master.kernel.org:/pub/scm/linux/kernel/git/roland/infiniband:
IB/ehca: Remove obsolete prototypes
IB/ehca: Remove use of do_mmap()
RDMA/addr: Handle ethernet neighbour updates during route resolution
IB: Make sure struct ib_user_mad.data is aligned
IB/srp: Don't wait for response when QP is in error state.
IB: Return qp pointer as part of ib_wc
IB: Include <linux/kref.h> explicitly in <rdma/ib_verbs.h>
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/drzeus/mmc: (32 commits)
mmc: tifm: replace kmap with page_address
mmc: sdhci: fix voltage ocr
mmc: sdhci: replace kmap with page_address
mmc: wbsd: replace kmap with page_address
mmc: handle pci_enable_device() return value in sdhci
mmc: Proper unclaim in mmc_block
mmc: change wbsd mailing list
mmc: Graceful fallback for fancy features
mmc: Handle wbsd's stupid command list
mmc: Allow host drivers to specify max block count
mmc: Allow host drivers to specify a max block size
tifm_sd: add suspend and resume functionality
tifm_core: add suspend/resume infrastructure for tifm devices
tifm_7xx1: prettify
tifm_7xx1: recognize device 0xac8f as supported
tifm_7xx1: switch from workqueue to kthread
tifm_7xx1: Merge media insert and media remove functions
tifm_7xx1: simplify eject function
Add dummy_signal_irq function to save check in ISR
Remove unused return value from signal_irq callback
...
Fix the key serial number collision avoidance code in key_alloc_serial().
This didn't use to be so much of a problem as the key serial numbers were
allocated from a simple incremental counter, and it would have to go through
two billion keys before it could possibly encounter a collision. However, now
that random numbers are used instead, collisions are much more likely.
This is fixed by finding a hole in the rbtree where the next unused serial
number ought to be and using that by going almost back to the top of the
insertion routine and redoing the insertion with the new serial number rather
than trying to be clever and attempting to work out the insertion point
pointer directly.
This fixes kernel BZ #7727.
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* git://git.kernel.org/pub/scm/linux/kernel/git/sfrench/cifs-2.6:
[CIFS] Minor cleanup
[CIFS] Missing free in error path
[CIFS] Reduce cifs stack space usage
[CIFS] lseek polling returned stale EOF
* 'upstream' of git://ftp.linux-mips.org/pub/scm/upstream-linus: (40 commits)
[MIPS] Yosemite: Fix missing parens in SERIAL_READ_1 macro
[MIPS] Fix warnings in run_uncached on 32bit kernel
[MIPS] Comment fix
[MIPS] MT: Nuke duplicate mips_mt_regdump() prototype.
[MIPS] Alchemy: Fix PCI-memory access
[MIPS] Move .set reorder out of conditional code
[MIPS] Check FCSR for pending interrupts before restoring from a context.
[MIPS] Jaguar ATX: Fix large number of warnings.
[MIPS] Jaguar: Fix MAC address detection after platform_device conversion.
[MIPS] SMTC: Make a bunch of functions and variables static.
[MIPS] Use compat_sys_pselect6
[MIPS] SMTC: Cleanup idle hook invocation.
[MIPS] SELinux: Add security hooks to mips-mt {get,set}affinity
[MIPS] IRIX: Linux coding style cleanups.
[MIPS] PB1100: Fix pile of warnings
[MIPS] Alchemy: Fix bunch of warnings
[MIPS] Whitespace cleanups.
[MIPS] Alchemy: Fix bunch more warnings.
[MIPS] Use ARRAY_SIZE macro when appropriate
[MIPS] Fix some whitespace damage
...
Tildes as in path as in filenames are handled correctly now:
only files, containing tilde '~', are backups, thus are not valid.
[KJ]:
Definition of `space' was removed, scripts/Kbuild.include has one.
That definition was taken right from the GNU make manual, while Kbuild's
version is original.
Cc: Roman Zippel <zippel@linux-m68k.org>
Cc: Bastian Blank <bastian@waldi.eu.org>
Cc: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: Oleg Verych <olecom@flower.upol.cz>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
GNU binutils, root users, tmpfiles, external modules ro builds must
be fixed to do the right thing now.
Cc: Roman Zippel <zippel@linux-m68k.org>
Cc: Sam Ravnborg <sam@ravnborg.org>
Cc: Horst Schirmeier <horst@schirmeier.com>
Cc: Jan Beulich <jbeulich@novell.com>
Cc: Daniel Drake <dsd@gentoo.org>
Cc: Andi Kleen <ak@suse.de>
Cc: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Oleg Verych <olecom@flower.upol.cz>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Replacing overhead of using some (external) programs
instead of good old `sh'.
Cc: Roman Zippel <zippel@linux-m68k.org>
Cc: Sam Ravnborg <sam@ravnborg.org>
Cc: William Stearns <wstearns@pobox.com>
Cc: Martin Schlemmer <azarah@nosferatu.za.org>
Signed-off-by: Oleg Verych <olecom@flower.upol.cz>
Acked-by: Mark Lord <lkml@rtr.ca>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
arch/mips/lib/uncached.c: In function 'run_uncached':
arch/mips/lib/uncached.c:47: warning: comparison is always true due to limited range of data type
arch/mips/lib/uncached.c:48: warning: comparison is always false due to limited range of data type
arch/mips/lib/uncached.c:57: warning: comparison is always true due to limited range of data type
arch/mips/lib/uncached.c:58: warning: comparison is always false due to limited range of data type
Signed-off-by: Yoichi Yuasa <yoichi_yuasa@tripeaks.co.jp>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
The problem was introduced in 2.6.18.3 with the casting of some
36bit-defines (PCI memory) in au1000.h to resource_size_t which may be
u32 or u64 depending on the experimental CONFIG_RESOURCES_64BIT.
With unset CONFIG_RESOURCES_64BIT, the pci-memory cannot be accessed
because the ioremap in arch/mips/au1000/common/pci.c already used the
truncated addresses.
With set CONFIG_RESOURCES_64BIT, things get even worse, because PCI-scan
aborts, due to resource conflict: request_resource() in arch/mips/pci/pci.c
fails because the maximum iomem-address is 0xffffffff (32bit) but the
pci-memory-start-address is 0x440000000 (36bit).
To get pci working again, I propose the following patch:
1. remove the resource_size_t-casting from au1000.h again
2. make the casting in arch/mips/au1000/common/pci.c (it's allowed and
necessary here. The 36bit-handling will be done in __fixup_bigphys_addr).
With this patch pci works again like in 2.6.18.2, the gcc-compile warnings
in pci.c are gone and it doesn't depend on CONFIG_EXPERIMENTAL.
Signed-off-by: Alexander Bigga <ab@mycable.de>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
The N32 and O32 pselect6 syscalls need to use compat_sys_pselect6 to
translate arguments from 32-bit to 64-bit layout.
Signed-off-by: Joseph Myers <joseph@codesourcery.com>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>