Open Source Distributed Shell Tools?

ColonelForbin74 asks: "While some may assume that most larger server clusters run advanced or custom software (e.g. Beowulf, cfengine, OSCAR), many of those stuck in the not-research-this-site-runs-production world know this simply isn't the case. Many people like myself are working with medium-to-large scale clusters with little help other than shell for() loops and some SSH trusted keys. What application-level tools are out there that might help SysAdmin / AppSupport types like myself run commands across a given cluster, push files out, etc.? In my desperation to have some sort of tool in my toolbox, I've actually created one. However, I have a hard time believing this is the best thing out there, and would appreciate all the ideas and links I can get!"
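The for()-loop-plus-trusted-keys approach the submitter describes can be sketched roughly like this. The hostnames are made up, and a DRY_RUN guard (an addition for illustration, not part of anyone's actual tool) prints the commands instead of opening real sessions:

```shell
#!/bin/sh
# Naive serial fan-out: run one command on each host over ssh.
# HOSTS and the command are placeholders; with DRY_RUN=1 (the
# default here) the script only prints what it would execute.
HOSTS="web01 web02 db01"
DRY_RUN=${DRY_RUN:-1}

for host in $HOSTS; do
    if [ "$DRY_RUN" = "1" ]; then
        echo "ssh $host -- uptime"
    else
        # BatchMode=yes fails fast instead of prompting for a password
        ssh -o BatchMode=yes "$host" uptime
    fi
done
```

This runs hosts one at a time, so a single hung box stalls the whole sweep — which is exactly the pain the tools discussed below try to solve.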
This discussion has been archived. No new comments can be posted.

  • Try this for a start (Score:5, Informative)

    by epsalon ( 518482 ) <slash@alon.wox.org> on Tuesday June 17, 2003 @07:39PM (#6227529) Homepage Journal
    A lecture [haifux.org] from the Haifa Linux Club [haifux.org] about the subject.
  • by Bandito ( 134369 ) on Tuesday June 17, 2003 @07:50PM (#6227679)
    DSH [sourceforge.net]? I used it a while back and was pretty happy with it.

    It was a bit unstable, but that was almost a year ago. Give it a try.
  • How about pdsh? (Score:4, Informative)

    by cinnerz ( 22046 ) on Tuesday June 17, 2003 @07:58PM (#6227763)
    PDSH [llnl.gov] works pretty well in my experience. It's good for running commands on the nodes, and pdcp can copy files out.
    • Pdsh doesn't yet fully support ssh. For command execution as rsh's replacement, yes, but not for rcp. The newest build of pdsh doesn't even compile correctly with --with-ssh without patching.
  • herdtools (Score:4, Informative)

    by MacJedi ( 173 ) on Tuesday June 17, 2003 @08:36PM (#6228076) Homepage
    herdtools [sourceforge.net]

    /joeyo

  • We have many extra Windows XP machines around here, which sit idle most of the time.

    We needed some machines for running stress tests against our network servers, but we didn't have enough horsepower to run a pure Linux-based clustering/distributed stress client.

    I looked around a bit, like you, and found there wasn't much.

    Because of this I have written some hackish Python code that basically creates a cross-platform, distributed, self-updating cluster.

    We use it to run our cross-platform stress test
    • Interestingly, that sort of thing seems to be what Python was invented for: it was the control language for the Amoeba [rit.edu] distributed OS, as described by Guido van Rossum himself [artima.com]:

      Guido van Rossum: In 1986 I moved to a different project at CWI, the Amoeba project. Amoeba was a distributed operating system. By the late 1980s we found we needed a scripting language. I had a large degree of freedom on that project to start my own mini project within the scope of what we were doing.

      I don't know if you knew ab

    • Sounds very interesting... are you by any chance using IPython as the shell? If not, it might be worth a look: it basically makes the Python CLI a super-shell, so it sounds like you could insert your distributed code under the IPython layer and get a distributed super-shell :-)
  • During the Munich IETF 1997(?) I used rdist (part of Irix) to copy files from one machine to 40 others, as someone thought NFS was not an option.

    When I had a set of (permanently running...) Unix workstations last, I used sh for-loops and ssh to run commands.

    During another cluster project [feyrer.de] I was happy to use NFS to share files, and used rsh in preference to ssh, as it was much faster.

    Oh, and if you ever need to render mpegs from jpegs, check out the UCB's excellent "mpeg_encode [berkeley.edu]", which does all the load balancing on a
  • by tellurian ( 90659 ) on Tuesday June 17, 2003 @10:09PM (#6228753)
    I like the Rocks Cluster Distribution [npaci.edu]. It is above all simple to use, well documented, and stable.

  • There's really not all that much to it... bundling some scripting in the language of your choice around parallel ssh sessions is a pretty decent solution that most people seem to arrive at.
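    A minimal sketch of that pattern: one background job per host, then wait for all of them. The hostnames are placeholders, and the ssh call is stubbed out with an echo so the sketch runs without any remote access:

```shell
#!/bin/sh
# Parallel fan-out: launch each per-host job in a background
# subshell, then block until every one has finished.
HOSTS="node1 node2 node3 node4"
cmd="uptime"

for host in $HOSTS; do
    (
        # Placeholder for: ssh -o BatchMode=yes "$host" "$cmd"
        echo "[$host] would run: $cmd"
    ) &
done
wait   # returns once all background sessions have exited
```

In real use you would also want per-host output capture and some cap on concurrency, which is most of what separates a for loop from the tools named in this thread.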
  • Tivoli! (Score:1, Funny)

    by Anonymous Coward
    If you got a few million to spare, Tivoli does anything!
  • radmind (Score:3, Informative)

    by More Trouble ( 211162 ) on Wednesday June 18, 2003 @12:07AM (#6229679)
    You might try radmind [radmind.org]. It's quite popular in the Mac OS X world, but was originally written for Solaris, Linux, and *BSD. There's a reasonably sized community using it, and a supportive mailing list.

    :w
  • SUN's grid engine (Score:4, Informative)

    by martin ( 1336 ) <maxsec.gmail@com> on Wednesday June 18, 2003 @07:08AM (#6231591) Journal
    is free for both Solaris (SPARC & x86) and Linux.

    http://wwws.sun.com/software/gridware/sge_get.html

    Grid engines tend to be more useful, as they can balance load better across non-dedicated hosts. Just my view, but it saves building a dedicated cluster when there are all these 2GHz Pentiums on the desktop... (assuming you have Linux on the desktop, of course)

    --
  • by abulafia ( 7826 ) on Wednesday June 18, 2003 @09:36AM (#6232551)
    CFengine rocks. It isn't a distributed shell, but for configuration management and remote automated changes, you can't beat it.
  • by kcurrie ( 4116 ) on Wednesday June 18, 2003 @10:32AM (#6233171)
    Where I work (a LARGE networking company that makes all kinds of networking hardware) a co-worker and I created multiple parallel SSH tools which enable you to run hundreds to thousands of concurrent outgoing sessions, depending on hardware. We have not yet had the cycles to look into open sourcing it, but hope to.

    I can share the basics of it here though, which should enable somebody else to easily build their own. On a day-to-day basis we needed to be able to run commands on 10,000+ Solaris and Linux boxes, and wanted to use SSH key authentication, but not keys with a null passphrase (if the private key were stolen, major security implications would present themselves :-) ). The only way to do this (other than having some expect-type program type in the passphrase for you) is to use the ssh-agent. The problem with the ssh-agent is that it simply does not have the ability to authenticate more than, say, 20+ ssh sessions at once (depending on machine load, etc.). What happens when too many ssh sessions attempt to authenticate against the ssh-agent is that you get many authentication failures due to timeouts. There are some hacks you can do to the ssh source code that will increase the number of times ssh will attempt to contact the agent, as well as the delay between attempts. We've done these hacks, but they were still nowhere near enough.
    The solution instead is to use MULTIPLE ssh-agents, and load balance between them. We wrote a tool that will prompt for our key passphrase and then load say 100 ssh-agents with that key loaded. When it starts the agents it records the variables SSH_AUTH_SOCK and SSH_AGENT_PID for each agent in a single file. We then have shell scripts wrapped around ssh commands that just randomly pick an agent to connect to, effectively load balancing.
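    The agent-pool idea described above can be sketched like this. The agents file below is fabricated for illustration (real entries would be recorded by the script that starts the ssh-agents), so the sketch only demonstrates the random selection and export step:

```shell
#!/bin/sh
# An agents file records one agent per line as
# "SSH_AUTH_SOCK SSH_AGENT_PID".  Before each ssh call, pick a
# random line and export that pair, spreading authentications
# across the pool of agents.
AGENT_FILE=/tmp/agent-pool.$$
cat > "$AGENT_FILE" <<'EOF'
/tmp/ssh-a1/agent.1001 1001
/tmp/ssh-a2/agent.1002 1002
/tmp/ssh-a3/agent.1003 1003
EOF

total=$(wc -l < "$AGENT_FILE")
# two random bytes, reduced to a 1..total line number
pick=$(( $(od -An -N2 -tu2 /dev/urandom) % total + 1 ))
line=$(sed -n "${pick}p" "$AGENT_FILE")
SSH_AUTH_SOCK=${line% *}   # everything before the space
SSH_AGENT_PID=${line#* }   # everything after the space
export SSH_AUTH_SOCK SSH_AGENT_PID
echo "using agent socket $SSH_AUTH_SOCK (pid $SSH_AGENT_PID)"
rm -f "$AGENT_FILE"
```

Any ssh invoked after the export will talk to the chosen agent, which is the whole load-balancing trick.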
    We run this whole thing on an OpenMosix cluster, which allows the ssh-agents and ssh processes to migrate across the machines once they start to use too much CPU time on their current node. We've found that Linux boxes seem to be much faster for SSH operations than Solaris (sparc) boxes, BTW.
    We have also written a parallel ssh tool that works similarly to others discussed here (and others NOT discussed here, like Ed Hill's clsh, which I used extensively in a previous life), except that our tool has a couple of other major features which (IMHO) are required in an enterprise environment. The biggest thing we've found is that when working on boxes in the far reaches of the world, we cannot assume that any common group of NFS mounts will exist, or work properly when we need them to. If you cannot be sure what remote mounts are available, how can you run scripts on the remote box? This prompted us to make our program able both to run Perl code fed directly to it and to (basically) deliver scripts remotely for execution, deleting them afterwards. So if we've written an administrative script called foo.sh, our tool will basically pipe the script across an SSH session to the remote end and run it, usually never having to touch the remote disk at all. This is VERY useful because when talking about 10k+ boxes, many of which are desktops, you can never be sure which partitions will be full.
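    The never-touch-the-remote-disk trick boils down to streaming a script into a remote shell's stdin: over ssh that is roughly `ssh host 'sh -s arg' < foo.sh`. In this sketch a plain local `sh` stands in for the ssh hop so it runs without remote access (the script name and argument are made up):

```shell
#!/bin/sh
# Write a throwaway "admin script" locally, then run it by piping
# it into a shell's stdin -- the script file itself never needs to
# exist on the machine that executes it.
cat > /tmp/foo.$$.sh <<'EOF'
echo "running as $(id -un) with arg: $1"
EOF

# In real use: ssh "$host" 'sh -s hello' < /tmp/foo.$$.sh
sh -s hello < /tmp/foo.$$.sh
rm -f /tmp/foo.$$.sh
```

With -s, sh reads commands from stdin and treats the remaining operands as positional parameters, so the piped script sees its arguments normally.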

    Using our parallel ssh tool, along with the ssh-agent load balancing and a 3 node OpenMosix cluster we've been able to run 1000 outgoing ssh sessions without issues. This means if you want to change root passwords on 10k boxes it only takes slightly longer than changing passwords on 10 boxes. A real time saver, to say the least :-)

    Comments anyone?

    BTW, is anybody using any hacks of OpenSSH to work similarly to sudo for giving out root access?

  • ghosts (Score:2, Informative)

    by jmason ( 16123 )
    'ghosts' is a command which has been included with perl in the 'eg' directory since at least 4.036. It does the job effectively, allowing you to do

    gsh somemachines somecommand

    or

    gcp somefile somemachines:/etc/newfile

    Worked great the last time I had to admin a large network (about 5 years ago ;). *EXTREMELY* simple, too.

    http://outflux.net/unix/software/gsh/ seems to be an updated version of this tool.
  • I'm a real fan of two programs:

    DSH is nice for relatively small things that need to get run everywhere, and has an interactive mode that works fairly well. It works from any command line.

    Pconsole, on the other hand, requires X and creates a separate terminal window for every host you are connecting to. There is a small 'command' window that echoes everything you type to all of the other terminals under its control. If you hav
