Minions - the affordable CPU farm

User avatar
mctom
Posts: 2744
Joined: Wed Nov 11, 2020 4:44 am
languages_spoken: english, polish
ODROIDs: OGA, XU4, C2, M1, H3+, SP3, Vu8M
Location: Gdansk, Poland
Has thanked: 368 times
Been thanked: 481 times
Contact:

Minions - the affordable CPU farm

Post by mctom »

So now since I got a handful of PCs to play with I have conducted a successful experiment!
Which is not entirely what I envisioned, but serves the same purpose - do stuff elsewhere, without modifying original programs.

tldr: I did "outrun" linked above, but manually.

So here's what I did, everything from a remote machine:
- sshfs my main PC's / to some directory on a remote machine,
- chroot into it and run some makefile from my local machine,
- profit. I built local project on a remote machine that doesn't even have gcc installed. It actually works...
This seems to be a known technique, albeit not really popular. https://unix.stackexchange.com/question ... hfs-folder
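
For reference, the manual steps above boil down to roughly the following, run as root on the remote machine. This is only a sketch under assumptions: the host `workstation`, the user `mctom`, the project path and the uid:gid `1000:1000` are placeholders, and `allow_other` is an assumed mount option so that a non-root user inside the chroot can see a FUSE mount made by root. The example call sets `DRY_RUN=1`, so it only prints the commands.

```shell
#!/bin/bash
# Sketch: build on a remote box using the workstation's root filesystem.
# Host, user, paths and uid:gid are placeholders (assumptions).

# Print the command instead of running it when DRY_RUN is set.
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$*"
    else
        "$@"
    fi
}

remote_make() {
    local src="$1" mnt="$2" dir="$3"
    # 1. Mount the workstation's / over SSH; allow_other lets non-root users read it.
    run sshfs -o allow_other "$src" "$mnt"
    # 2. chroot (coreutils version) into it as a regular user and run make there.
    run chroot --userspec 1000:1000 "$mnt" /bin/bash -c "cd $dir && make"
}

# Example: prints the two commands without touching the system.
DRY_RUN=1 remote_make mctom@workstation:/ /remote /home/mctom/project
```

Dropping `DRY_RUN=1` runs the commands for real, which needs root on the remote machine and working SSH access back to the workstation.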

The method only works with chroot from coreutils, not the simplified version from busybox. The former has an option to chroot as a defined user, and not root.

Frankly, most other programs don't work as expected. Even the configure step preceding a successful make didn't work, I had to run it locally first. mc apparently still thinks I'm root. htop segfaults, lol..
tmux has trouble creating a file in /tmp, ncdu works only if asked not to read config files (thinks I'm root again), and sl has a slightly bumpy ride across the screen.

I'll look around in the outrun code to see what other tricks are up their sleeves. I remember that outrun does recreate environment variables, which totally makes sense and could convince some programs that I'm not root.

To me, that's an exciting development, or at least a learning opportunity. :D
Punk ain't no religious cult, punk means thinking for yourself!

Maintainer of PiStackMon

Re: True Love - Ultimat resource sharing over LAN

Post by mctom »

It's alive!

After a week of battle, I made a script that spits out tailored dCore images, complete with a set of SSH keys for communication in both ways, setting up everything on startup and all.
This assumes two things: the machine, once it boots via PXE, is dedicated to serving my needs only, and since it has access to my filesystem, the PXE image must be kept secure. If it is compromised, the matching key in .ssh/authorized_keys must be revoked.

I crafted a bash wrapper that sends a task to a minion machine via ssh, together with all environment variables and some other stuff.
A minion chroots into my filesystem, provided via sshfs, and executes the command. It may fork or call other commands with no problem.

It actually works!

I ran the same task, a complete build of dropbear using make. The "minion" images have no tools installed and sourced them from my machine.
The invocation is exactly what one would expect: minion make -j8
VirtualBox: 2m18
Wyse 3040: 3m13
H3+ (host): 11s :D

So clearly there is some room for improvement...
I suspect the file caching built into sshfs isn't working too well, or at all. Most of the minion's CPU time is spent in the sshfs process.
Repeating make on Wyse machine, but with -j16 decreased job time to 2m52, which suggests latency issues.

Picture below shows a VirtualBox VM running "gcc" processes on behalf of my workstation.
Attachment: 2023-02-18-225542_2560x1440_scrot.png (634.03 KiB)

User avatar
rooted
Posts: 10037
Joined: Fri Dec 19, 2014 9:12 am
languages_spoken: english
Location: Gulf of Mexico, US
Has thanked: 788 times
Been thanked: 587 times
Contact:

Re: True Love - Ultimat resource sharing over LAN

Post by rooted »

Congratulations on the success thus far. There's always a bit of tuning that can be done.

Is there a reason not to use NFS vs SSHFS or is it not an option with your setup?

Re: True Love - Ultimat resource sharing over LAN

Post by mctom »

rooted wrote:
Sun Feb 19, 2023 8:58 am
Congratulations on the success thus far. There's always a bit of tuning that can be done.

Is there a reason not to use NFS vs SSHFS or is it not an option with your setup?
As far as I know, NFS offers no authentication, and I'd be less than happy sharing my entire filesystem with no protection at all, even inside my LAN.
Also I'm not sure whether the file ownership and permission bits would be handled correctly.
And finally, I think the poor network performance is a latency problem that NFS probably would not solve.

The typical data transfer during a compile job was less than 1MB/s, yet the sshfs and ssh processes consumed the majority of CPU power on the minion side.

Re: True Love - Ultimat resource sharing over LAN

Post by rooted »

I'm guessing two-way encryption is the cause of the high CPU usage, which is why I mentioned NFS.

NFSv4 offers authentication.

Re: True Love - Ultimat resource sharing over LAN

Post by mctom »

But is it some 30% of CPU time for 1MB/s transfer? It's hard to believe.
I have also chosen the weakest cipher still available in OpenSSH, one that should work fine with Intel AES extensions.
I suspect what takes so much time is processing and parsing each filesystem call in some convoluted way.

But either way, a well working cache should improve this situation. I'll see what I can do with sshfs mounting options, and test whether cipher actually affects the CPU use in a visible way.
I'll keep it scientific and record my results for comparison.

User avatar
mad_ady
Posts: 11322
Joined: Wed Jul 15, 2015 5:00 pm
languages_spoken: english
ODROIDs: XU4 (HC1, HC2), C1+, C2, C4 (HC4), N1, N2, H2, Go, Go Advance, M1
Location: Bucharest, Romania
Has thanked: 647 times
Been thanked: 1081 times
Contact:

Re: True Love - Ultimat resource sharing over LAN

Post by mad_ady »

You can measure ssh throughput with a scp transfer. When it hogs 100% of a core, throughput is affected.

NFS (even v3) has some security built-in - like sharing a specific share with select IPs or IP ranges. Now, I don't know what your girlfriend is up to, but if she's into IP spoofing, like my wife, I'd use sshfs too.

Re: True Love - Ultimat resource sharing over LAN

Post by mctom »

I logged into minion and used the very same sshfs connections to benchmark a file copy, using time cp <mounted remote fs> <local dir>.
A 518MB file was copied in 8.07s (64.2MB/s).
When repeated, it took 0.63s, so it was clearly cached (even though sshfs connection had no explicit parameters for that). Even "top" reports that the memory used for caching swelled.
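
The arithmetic behind those figures can be checked with a one-liner. This is just a sketch; the function name `throughput` is made up for illustration.

```shell
# MB/s = size in MB / elapsed seconds, rounded to one decimal place.
throughput() {
    awk -v mb="$1" -v s="$2" 'BEGIN { printf "%.1f\n", mb / s }'
}

throughput 518 8.07   # first copy over sshfs   -> 64.2
throughput 518 0.63   # repeated copy from cache -> 822.2
```

The second figure is far beyond what a gigabit link can carry, which supports the reading that the repeated copy never went over the network.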

Which brings me to the conclusion that the cache is working (always has been), but even so, it takes some half a second to access it, or at least to make sure the file did not get updated.
When mounted with kernel_cache option (that assumes the files have not been changed externally), the transfer rate was 63MB/s and repeated access took 0.23s.

Copying back to the workstation has a random speed between 19-45MB/s.

Well I did my scientific study and tested some sshfs parameters, including cache_timeout, auto_cache, max_readahead, different ciphers and so on. Some 14 benchmarks later, no matter what options I used, the build time was between 3m10s and 3m13s.

So I think the next logical step is to see if NFS performs any better?
My GF probably won't be spoofing IPs and I'm okay with merely limiting the IP range.
After all I could use a second LAN adapter on H3+ to create an internal network just for that.

EDIT: On second thought, only my /home and /etc are sacred, so I guess sharing /usr read-only isn't really a security threat.

Re: True Love - Ultimat resource sharing over LAN

Post by mad_ady »

mctom wrote: After all I could use a second LAN adapter on H3+ to create an internal network just for that.
Or VLANs on the same adapter/physical network (in case you were looking for a new rabbit hole to go through). Or just a secondary IP range (you can configure multiple IPv4 addresses on the same interface), and assign from a different range on your DHCP server just for the minions. Though a snooping attacker would see that and could just add a static IP for that range... I'm sorry, I've been into network security for so long, I see mainly the holes...

mctom wrote: So I think the next logical step is to see if NFS performs any better?
NFS should fill all the available bandwidth (1Gbps). Whether that influences anything or not, I don't know.

Re: True Love - Ultimat resource sharing over LAN

Post by mctom »

That's okay, I keep enthusiasts in high regard, no matter if it's circuit design or network security. Thanks for your input!

So I have set up nfs share with my /usr, read only, and limited to minion[0-9] hostnames for now. And I mounted it on top of my / shared via sshfs.
The build time dropped from average 191 seconds to 110 :shock:

I'm really not comfy sharing my /home, so instead I shared a subdirectory in /tmp. As before I mounted it over sshfs filesystem on the minion.
I placed my make job in there and what do you know.. 51s! :o

I'm pretty sure that's the performance limit of this setup. Previously I timed local builds on a Minion and on the H3+, and the Minion turned out roughly 3.7x slower (~40s against 11s). Now achieving 50s over the network I'm more than happy.

I'll spend some more time replacing the sshfs with nfs altogether. This will also simplify my minion image. But I'll have to learn more about nfs security. Why won't it just work like sshfs dammit.
Or maybe I'll go all in and disregard all security and just set it all up using a second LAN adapter in H3+.. What else could I use it for after all. :)

Is there any trick that would help me share entire filesystem via nfs, with read and write privileges for each file and directory as seen from the perspective of a specific user?

----
By the way I've done another experiment: I added gcc to my ~/.local/bin, with a bash script that calls minion to do the gcc job. Then I could run make locally and the gcc part got executed externally. The performance was comparable (albeit slightly slower, due to dropbear connection limit I haven't compiled in just yet).
So the user will be free to choose which executables should go to the minion cluster and it will happen transparently. How cool is that!
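
A shim like that could look roughly as follows. This is a sketch, not the actual script: the function name `install_gcc_shim` and the path `/usr/bin/gcc` are assumptions, and it relies on a `minion` wrapper being on PATH. The real compiler is called by absolute path because the minion presumably inherits the local PATH, so a bare `gcc` there could find the shim again.

```shell
#!/bin/bash
# Sketch: install a "gcc" shim that forwards every call to the minion wrapper.
# The shim directory and the real compiler path are assumptions.

install_gcc_shim() {
    local dir="$1" real_gcc="${2:-/usr/bin/gcc}"
    mkdir -p "$dir"
    # Use the absolute compiler path so the shim cannot end up calling itself.
    cat > "$dir/gcc" <<EOF
#!/bin/bash
exec minion "$real_gcc" "\$@"
EOF
    chmod +x "$dir/gcc"
}

# Usage: install_gcc_shim ~/.local/bin
```

The same pattern would work for any other executable that should go to the cluster, as long as the shim directory comes first in PATH.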

Re: True Love - Ultimat resource sharing over LAN

Post by rooted »

That is quite an improvement, very nice.

Re: True Love - Ultimat resource sharing over LAN

Post by mctom »

I'm more excited than usual, I admit. :)
It works, and it works quite damn well.
Now I need some plastic spacers to cluster 7 units together, that's the last bit missing for the physical build.

Re: True Love - Ultimat resource sharing over LAN

Post by mad_ady »

mctom wrote: Is there any trick that would help me share entire filesystem via nfs, with read and write privileges for each file and directory as seen from the perspective of a specific user?
I haven't used nfsv4, but in nfsv3 file permissions are exposed directly to the client, and the client sees the same uids and gids the server sees (and maps them to names locally). So, if on the server you have a file with permissions 0754 owned by mctom(uid 1001):mctom(gid 1005), on the client you'll see permissions 0754, for a file owned by uid 1001:1005. The names of these uid/gid will be local to the client (whatever it has in its /etc/passwd, /etc/group). So, part of your image building process might benefit from duplicating your users/groups from H3...

Re: True Love - Ultimat resource sharing over LAN

Post by mctom »

Excellent! I have already recreated user 1000:1000 with matching uid and gid, otherwise nothing worked in chroot. :)
I rolled another image with nfs everywhere instead of sshfs. The size is similar again, some 55MB. And it works as expected. :)

The / export had some directory contents missing, namely /dev, /proc, /sys and /tmp were empty.
I added mount --bind /dev /remote/dev so scripts stop complaining there's no /dev/null.
/tmp is somewhat important (make uses it) so I dug through Google only to find out it's a bug in nfs (mounting tmpfs in general). The workaround is to export it separately, with fsid=whatever defined.
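
An export set along those lines might look like the fragment printed below. It is a sketch under assumptions: the `rw` and `no_subtree_check` options and the value `fsid=1` are illustrative, and only the separate /tmp export with an explicit fsid comes from the description above.

```shell
#!/bin/bash
# Sketch: print an /etc/exports fragment that exports / and /tmp separately.
# The option sets are assumptions; only fsid= on /tmp comes from the post.

exports_fragment() {
    local clients="$1"
    # tmpfs is not backed by a device and has no UUID, so it needs an explicit fsid.
    cat <<EOF
/    $clients(rw,no_subtree_check)
/tmp $clients(rw,no_subtree_check,fsid=1)
EOF
}

exports_fragment 'minion[0-9]'

# On the minion, after mounting the exports under /remote (from the post):
#   mount --bind /dev /remote/dev
```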

So I guess the image is ready for now. It boots, and after some 4 minutes it's ready to accept tasks.
Now I'll focus on assembling a physical cluster, and then we'll experiment with distributing the load. How exciting!

@mad_ady, a question just for you: Is this a bad idea to broadcast magic UDP packets every 10-50ms? Is this a substantial load for networking equipment?

Re: True Love - Ultimat resource sharing over LAN

Post by mad_ady »

NFS by default exposes only one filesystem, not the other filesystems mounted on top of it, like /dev, /mnt, etc. To expose those, add the crossmnt option in your /etc/exports, next to the desired export.

A packet every 10ms leads to 100pps of traffic. That's not a problem for modern networking equipment. For example, 1Gbps traffic of full (1500 byte) packets produces ~80kpps. But if it's WoL traffic, that goes to your broadcast address and is usually processed in the CPU by all hosts in that broadcast domain. So expect 0.0x% extra CPU usage and an extra 100 IRQs per second in each device.
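
The packet-rate numbers above follow from simple integer arithmetic. The sketch below uses made-up function names and ignores Ethernet framing overhead, so the link figure comes out slightly above the real line rate.

```shell
# Packets per second for a given interval between packets (milliseconds).
pps_from_interval_ms() {
    echo $(( 1000 / $1 ))
}

# Packets per second needed to fill a link with fixed-size packets.
pps_from_bandwidth() {
    local bits_per_sec="$1" packet_bytes="$2"
    echo $(( bits_per_sec / (packet_bytes * 8) ))
}

pps_from_interval_ms 10              # one packet every 10 ms      -> 100
pps_from_bandwidth 1000000000 1500   # 1 Gbps of 1500-byte packets -> 83333
```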

Re: True Love - Ultimat resource sharing over LAN

Post by mctom »

So I mixed up two things - it was correct for NFS to not expose mounts, but if one wishes to mount tmpfs, there was a bug at some point. Not sure if my version is affected. :)
That one problem was quite hard for me to crack, and if something is hard to solve, that usually means it's a combination of two or more problems.
Anyway thanks again, it's unimaginable how much I've learned in the last week. :)

I wanted Minions to broadcast information that they are available and not loaded with work, so others may pick a random worker to take over a task (the first broadcast they receive).
It's worth noting that minions may spawn tasks across each other as well, so everyone is to be kept informed.
This solution sucks because it's not scalable - if I had 1000 minions the network would be loaded with these broadcasts. I had no reference for the capabilities of modern networking.
Even if minion daemons were smart enough to not spam the network if it's already spammed, and keep the overall broadcast density at 100Hz, that still introduces an inevitable delay, just waiting for some minion to broadcast a "bleep".

I just figured I may let minions create a file in /tmp instead. ha.. And a timestamp inside, just in case some minion dies without removing the file.
Adjusting clocks is already one of the startup steps for minions. A quarter of the image size just for ntpdate and dependencies :D

Re: True Love - Ultimat resource sharing over LAN

Post by rooted »

Shouldn't the system you are using for task scheduling handle node availability and spin up (wake) a sleeping node automatically?

Re: True Love - Ultimat resource sharing over LAN

Post by mad_ady »

rooted wrote:Shouldn't the system you are using for task scheduling handle node availability and spin up (wake) a sleeping node automatically?
No need for that! I assume mctom's workers compile stuff 24/7.

Regarding availability - there are multiple tools (I'd imagine) that allow the workers to communicate their availability. I wouldn't use broadcast because it's wasteful for all the nodes, but I'd use unicast - all slaves report to the master, or via a message brokering mechanism - like mqtt (or apache kafka), where slaves (and masters) subscribe to various topics, and slaves can report telemetry/receive commands, and other interested parties can consume these messages.

Re: True Love - Ultimat resource sharing over LAN

Post by mctom »

Well, the PXE boot and initial setup takes anything between 3 and 4 minutes. UEFI is to blame, so they say, the image download speed is awful. Then tinycore spends some time discovering network adapters and so on. The script waits for the ping to my workstation to succeed before it mounts drives and launches ssh server.

I haven't considered putting nodes to sleep, got to test that. The question is how long would it take and whether NFS mounts would persist. I think they should.
And what would it take to wake them up?
I tested wake on LAN before and it didn't work on Wyse, so that would require some more digging. It should support this functionality and I turned it on in each unit's BIOS.

Considering all that, I was thinking about a manual switch to power up all nodes (they are all set up to power on when "AC" comes back), when I anticipate yay update or a big simulation job. I think this will work better than automatic setup that spins up my minions and waits 4 minutes before it's done. ;)
mad_ady wrote:
Mon Feb 20, 2023 6:47 pm
Regarding availability - there are multiple tools (I'd imagine) that allow the workers to communicate their availability. I wouldn't use broadcast because it's wasteful for all the nodes, but I'd use unicast - all slaves report to the master, or via a message brokering mechanism - like mqtt (or apache kafka), where slaves (and masters) subscribe to various topics, and slaves can report telemetry/receive commands, and other interested parties can consume these messages.
You know what, I'll just create those files in /tmp like it was 1997 and see if it works first. :) This approach requires no additional tools other than a small program that I can put together from pistackmon daemon - the CPU monitor.
The idea is that each minion will create a file, like /tmp/minion/.<hostname> with a Unix timestamp in it. And it will refresh it every 5 seconds, or delete it if it's busy, or gracefully shut down.
Any other party that wants to delegate work "anywhere else" will pick a random file from /tmp/minion, check if timestamp is not older than 10 seconds, and launch a job there.
If the timestamp is expired, every node can clean it up at this point.
What could possibly go wrong? :lol:
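
The writing and clean-up halves of that scheme could be sketched like this. It is a minimal illustration, not the eventual daemon: the function names, the temp-file rename and the default 10-second limit are assumptions layered on the description above.

```shell
#!/bin/bash
# Sketch of the heartbeat-file idea; function names are assumptions.

# Write the current Unix time into a per-host dot-file (call every 5 seconds).
heartbeat_write() {
    local dir="$1" host="$2" tmp
    mkdir -p "$dir"
    tmp="$dir/tmp.$host.$$"
    # Write to a temp file and rename, so readers never see a half-written file.
    date +%s > "$tmp" && mv "$tmp" "$dir/.$host"
}

# Remove heartbeat files whose timestamp is older than max_age seconds.
heartbeat_reap() {
    local dir="$1" max_age="${2:-10}" now f stamp
    now=$(date +%s)
    for f in "$dir"/.[!.]*; do
        [ -f "$f" ] || continue
        stamp=$(cat "$f" 2>/dev/null) || continue
        # Skip anything that is not a plain number rather than deleting it.
        [[ "$stamp" =~ ^[0-9]+$ ]] || continue
        if (( now - stamp > max_age )); then rm -f "$f"; fi
    done
}

# Usage: heartbeat_write /tmp/minion "$(hostname)"; heartbeat_reap /tmp/minion 10
```

One thing that can go wrong is clock skew: the age check compares timestamps written by one machine against the clock of another, so it only works if the clocks agree to within a few seconds.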

IF I manage to beat the time of a local build of dropbear on H3+ (11s) with this thing, I'm going to high five myself so hard... :D

Re: True Love - Ultimat resource sharing over LAN

Post by mad_ady »

mctom wrote: IF I manage to beat the time of a local build of dropbear on H3+ (11s) with this thing...
Isn't it like that joke with 9 women giving birth to a child in a month?

Assuming that you want to make -j $minions and have a gcc wrapper that starts on an empty node, there will still be dependencies - having to wait for other build objects. Maybe you won't feel it for dropbear, but for heavier things like kernel/firefox/chrome it's crazy enough to work.
And I wouldn't use $minions, but $minions * $number_of_cores. Why keep idle cores?

Also, to speed up boot time, I'd look into suspend to ram. Suspend to disk needs a swap partition the size of your ram, and you still need to load kernel/initramfs from the network...

For some BIOSes, wol doesn't work when the board was powered off, but works when put into S3 mode.

Re: True Love - Ultimat resource sharing over LAN

Post by mctom »

Which brings us to the reason "why not distcc".
In modern build jobs, such as my FPGA tools (icestorm, nextpnr, yosys), the vast majority of "build time" is not spent in gcc - in this example, it's mostly a loveton of Python scripts doing something.
distcc also assumes that all machines use the same gcc version - which is hard to guarantee on different systems in its own right.

So yeah, I want to believe that running make -j28 will make a difference - we'll see about that. You're right maybe dropbear won't be the best benchmark, that's why I'll be amazed if it actually beats the 11s build time of dropbear on H3+. I doubt that :roll:

But to be honest, the major reason to build this cluster is to perform massive amounts of ngspice simulations, that I'm sure will outperform any single machine.
Hah, might as well test the whole cluster with boinc. Run 30 jobs at once, why not.

Those Wyse machines, after I figured out how to disable their 2W power cap, can work in turbo boost with 2 cores stressed, or at base frequency on 4 cores. So I guess I'll use 4 cores, and flag them as busy with CPU > 80%.

I'd rather not use the internal storage of Wyse at all - the reason being, if I accidentally boot some other machine with PXE, I don't want to nuke it. :D

But I'll experiment with S3 mode today, as you say. I haven't tried that yet. If the minion draws a reasonable amount of power in S3 mode (say <0.5W) I could let them stay in that mode all the time, and wake up on demand. They idle at 2.5W and draw up to 5W at full load.

Re: True Love - Ultimat resource sharing over LAN

Post by mctom »

Nope, WOL doesn't work in suspend mode either, even after (I think?) I turned it on in both BIOS and Linux.
And it's not worth it really - 1.8W in suspend, compared to 2.2W idle.
All tunables in "powertop" gave no visible result in power consumption.
So I think I'll use a relay to turn the cluster on and off. :D

Re: True Love - Ultimat resource sharing over LAN

Post by rooted »

Reasonable fix for no WoL. If you don't mind me asking, what did you pay for the lot of them?

I looked around here in the States and I saw a lot of 10 for $180 but that was without power supplies.

Re: True Love - Ultimat resource sharing over LAN

Post by mctom »

In situations like that I often buy a few more items and sell them afterwards with a profit to offset the cost. I bought 12 pieces, sold 3 so far, so in the end I paid some $128 for 9 units, including shipping from Denmark.
Those were without power supplies but I don't need them. I have a 5V 30A industrial PSU from my old RasPi cluster build, and a voltage distribution board from the same project.
I had to buy supply cables, because those Wyse machines have a weirdly sized barrel jack input.
Two units had dead CMOS batteries, which had to be replaced, otherwise they got stuck on boot. It's a CR2032 on a cable, pretty much like in ODROIDs, if not the same.
No other issues have been found, except they need fresh thermal paste under the heatsink (passive cooling for 4 years). I used cheap thermal pads instead.

The same seller has dropped another batch of some 100+ units on an auction site 2 days ago, I set up an auction sniper to try and get more if their price is below $10. I don't have high hopes, but if I win another 10 then why not run make -j80 :D
Still a far better deal than the PCIe Xeon Phi that I almost bought. Fortunately I didn't, because I had no machine I could plug it into.

Re: True Love - Ultimat resource sharing over LAN

Post by mctom »

mad_ady wrote:
Mon Feb 20, 2023 4:15 pm
NFS, by default exposes only one filesystem, not mountpoints above it, like /dev, /mnt, etc. To expose those, add the crossmnt option in your /etc/exports, next to the desired export.
I guess I am too dumb for this stuff. It worked, and then stopped working. /tmp existed but I couldn't cd into it.
In the end I exported / and /tmp separately, without crossmnt, but for some reason, now when I mount / the /tmp is available...
Doesn't matter, it works as it is. :)

Anyway!

I created a daemon, appropriately called hypothalamus :lol: after a part of a brain that, among other things, controls food intake.
It monitors CPU activity and signals it by creating files in my workstation's /tmp/minions path.
The naming convention is <hostname>.<num>, and the contents are a UNIX timestamp, refreshed periodically.
It creates a few files with consecutive numbers, indicating how many CPU cores are not saturated yet. So, for a 4-core system, it will maintain 0-4 files.
Additionally it will remove all of the files if the threshold of 80% is reached.
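
How CPU load turns into a file count is not spelled out above, so here is one possible reading as a sketch. The rounding rule (idle share of the cores, rounded down) and both function names are assumptions; only the 80% cut-off comes from the description.

```shell
#!/bin/bash
# Sketch: turn CPU load into a number of "free slot" files to advertise.
# The rounding rule is an assumption; only the 80% cut-off comes from the post.

# Busy percentage between two "cpu ..." lines sampled from /proc/stat.
cpu_busy_percent() {
    local -a a=($1) b=($2)
    local i total_a=0 total_b=0 idle total
    # Fields 1-8 are user, nice, system, idle, iowait, irq, softirq, steal.
    for i in 1 2 3 4 5 6 7 8; do
        total_a=$(( total_a + a[i] ))
        total_b=$(( total_b + b[i] ))
    done
    idle=$(( (b[4] + b[5]) - (a[4] + a[5]) ))   # idle + iowait
    total=$(( total_b - total_a ))
    if (( total <= 0 )); then echo 0; return; fi
    echo $(( 100 * (total - idle) / total ))
}

# Number of slot files to keep: none at or above the threshold,
# otherwise the idle share of the cores, rounded down.
free_slots() {
    local cores="$1" busy="$2" threshold="${3:-80}"
    if (( busy >= threshold )); then echo 0; return; fi
    echo $(( cores * (100 - busy) / 100 ))
}

# Usage on a live system:
#   s1=$(head -n1 /proc/stat); sleep 1; s2=$(head -n1 /proc/stat)
#   free_slots "$(nproc)" "$(cpu_busy_percent "$s1" "$s2")"
```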

This will allow me to implement a quasi load balancing mechanism. My workstation and minions, willing to delegate a task somewhere else, will pick one file at random, thus it will be less likely to pick a machine that is already loaded. If no files are available, the machine will do the task itself.
Timestamps will be verified upon selection to make sure that's not a leftover from abruptly shut down minion.
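
The selection step could be sketched as below. The function name `pick_minion` and the 10-second staleness limit are assumptions. Because every free core has its own file, a lightly loaded host simply has more tickets in the draw.

```shell
#!/bin/bash
# Sketch: pick a random, still-alive minion from the slot files.
# Assumes files named <hostname>.<num> that hold a Unix timestamp.

pick_minion() {
    local dir="$1" max_age="${2:-10}" now f stamp
    local -a fresh=()
    now=$(date +%s)
    for f in "$dir"/*.*; do
        [ -f "$f" ] || continue                 # also skips an unmatched glob
        stamp=$(cat "$f" 2>/dev/null) || continue
        [[ "$stamp" =~ ^[0-9]+$ ]] || continue
        # Keep only slots refreshed recently enough.
        if (( now - stamp <= max_age )); then fresh+=("$f"); fi
    done
    # No usable slot: the caller should run the task locally.
    if (( ${#fresh[@]} == 0 )); then return 1; fi
    f=${fresh[RANDOM % ${#fresh[@]}]}
    f=${f##*/}        # strip the directory
    echo "${f%.*}"    # strip the .<num> suffix, leaving the hostname
}

# Usage: host=$(pick_minion /tmp/minions) || host=localhost
```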

Additionally, a super simple mproc script, as opposed to nproc, returns a number of cores available locally AND among the minions - which will be a handy argument for make. The number of minion cores is estimated based on the number of files in /tmp/minions.
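
A guess at what such an mproc could look like, as a sketch rather than the actual script. It assumes one slot file per free minion core and does not check timestamps.

```shell
#!/bin/bash
# Sketch of mproc: local cores plus one per slot file in the minions directory.

mproc() {
    local dir="${1:-/tmp/minions}" remote=0 f
    for f in "$dir"/*.*; do
        # Count regular files only; an unmatched glob is skipped here too.
        if [ -f "$f" ]; then remote=$(( remote + 1 )); fi
    done
    echo $(( $(nproc) + remote ))
}

# Usage: make -j"$(mproc)"
```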

Now what seems to be the final stage, is to expand my minion bash wrapper to delegate tasks to random minions, rather than just one I have here, begging for mercy.
AND document all that stuff before I forget what I've done. :D

So I could use some bash whiz advice here. I noticed that when I Ctrl+C out of my minion wrapper, the task carries on running on the Minion and I lose control over it. Is there any way to improve this script?
Somewhere in that ssh-chroot-bash matryoshka the underlying process does not get killed.

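Code:

#!/bin/bash
# Usage: minion <command> [args...]

# Capture the local environment as NAME="value" pairs on a single line.
# Caveat: values that contain quotes or newlines will break this simple quoting.
ENVS=$(env | sed 's/=/="/;s/$/"/' | tr '\n' ' ')

# Run the command on the minion as uid:gid 1000:1000, chrooted into the shared filesystem.
ssh root@minion1 chroot --userspec 1000:1000 /remote "/bin/bash -c 'cd $PWD; $ENVS $*'"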

Re: True Love - Ultimat resource sharing over LAN

Post by mad_ady »

See if this helps: https://stackoverflow.com/questions/443 ... ut-t-optio

Re: True Love - Ultimat resource sharing over LAN

Post by mctom »

mad_ady wrote:
Wed Feb 22, 2023 5:27 pm
See if this helps: https://stackoverflow.com/questions/443 ... ut-t-optio
so ssh -t should help. Cool, I'll check that out when I get a chance.

I'm thinking about occasions when a parent process actually tries to kill its offspring for whatever reason (does that happen?). It will kill one of the matryoshka parts instead.

In the meantime I learned that dropbear can be built as chroot jail easily: https://www.howtoforge.com/chrooted-drop-bear-howto
That would reduce matryoshka to just the ssh session, with no bash or chroot part.
Perhaps I won't even have to create dummy accounts for the users.. And it will actually use the ssh keys I have on my machine? We'll find out.

Re: True Love - Ultimat resource sharing over LAN

Post by mctom »

ssh -t helps. :) Thank you @mad_ady.

So I wanted to share some exciting developments!

- iPXE can be built with nfs support, so boot time has been reduced by about 1 minute. And I won't have to set up a separate http server for the OS image.
- I can run dropbear on a minion, by chrooting into my machine. So it doesn't have to be included in the OS image, but also! It uses my workstation's .ssh directories for authentication, so as long as mctom may log in as mctom, using his own passwordless key, it just works. There's no need for dummy users on a minion anymore.
Also, the minion wrapper for launching programs doesn't use chroot anymore, so the OS image for minions doesn't need coreutils. In fact, it doesn't need any extra packages at this point; the sole modification to vanilla dCore is a script that pulls more startup scripts from the nfs share.

It works so well now I can even run mc on a minion and browse my local filesystem.
Ctrl+C out from make works as expected.

I'll spend the day reorganizing everything into neat scripts, and perhaps a makefile in the end, we'll see.

I set up three minions from my stash so I can begin experimenting with multiple minions and load balancing.
And I won 5 more Wyse machines. :D Cheaper than last time.

Re: True Love - Ultimat resource sharing over LAN

Post by mctom »

Hmm HOWEVER there is one problem I can't crack. The SSH authentication takes much longer, anywhere from 400ms up to 15 seconds. That's not good. And it's random.
This only happens when I ssh into a dropbear that runs on VirtualBox, chrooted straight from my H3+.
If I run dropbear directly on VirtualBox, the delay is well below 100ms.
Connecting to localhost (openssh server) on H3+ takes about 300ms.
Connecting to localhost (dropbear server) is closer to 100ms.

One thing to note: even when chrooted, the local /dev is used, not the remote one.

User avatar
mad_ady
Posts: 11322
Joined: Wed Jul 15, 2015 5:00 pm
languages_spoken: english
ODROIDs: XU4 (HC1, HC2), C1+, C2, C4 (HC4), N1, N2, H2, Go, Go Advance, M1
Location: Bucharest, Romania
Has thanked: 647 times
Been thanked: 1081 times
Contact:

Re: True Love - Ultimat resource sharing over LAN

Post by mad_ady »

Hmm, long delays - like 15-20s can be caused by DNS, of all things! Openssh (don't know about dropbear) wants to do a reverse lookup on the client IP, so it can write the dns name in the log. To rule this out, how about adding your h3 and minions IP/names in /etc/hosts, so it doesn't have to ask the DNS server?

If this doesn't improve, you may need to run ssh -vvv or run it through strace, to see what's happening when it's waiting for something.

Re: True Love - Ultimat resource sharing over LAN

Post by mctom »

I have already disabled all forms of logging, reverse DNS lookup is disabled in dropbear by default.

I conducted some tests with ssh -vvv and tapped the enter key repeatedly to mark where it paused for longer. There are two spots actually. I tried googling this and found someone else reporting the same issue, plus a bunch of unhelpful armchair professors.

Anyway, here's a sample log I just captured. That's my first attempt at logging in after the VirtualBox machine started.
I supplied the correct key explicitly so ssh tries it first and doesn't waste time on other keys.

Code: Select all

[mctom@Tomusiomat ~]$ time ssh -Tvvvv virtualbox -i ~/.ssh/id_rsa_passwordless.key -p 11122 whoami
OpenSSH_9.1p1, OpenSSL 3.0.7 1 Nov 2022
debug1: Reading configuration data /home/mctom/.ssh/config
debug1: /home/mctom/.ssh/config line 1: Applying options for *
debug1: Reading configuration data /etc/ssh/ssh_config
debug3: expanded UserKnownHostsFile '~/.ssh/known_hosts' -> '/home/mctom/.ssh/known_hosts'
debug3: expanded UserKnownHostsFile '~/.ssh/known_hosts2' -> '/home/mctom/.ssh/known_hosts2'
debug2: resolving "virtualbox" port 11122
debug3: resolve_host: lookup virtualbox:11122
debug3: ssh_connect_direct: entering
debug1: Connecting to virtualbox [192.168.0.254] port 11122.
debug3: set_sock_tos: set socket 3 IP_TOS 0x48
debug1: Connection established.
debug1: identity file /home/mctom/.ssh/id_rsa_passwordless.key type 0
debug1: identity file /home/mctom/.ssh/id_rsa_passwordless.key-cert type -1
debug1: identity file /home/mctom/.ssh/id_rsa_passwordless.key type 0
debug1: identity file /home/mctom/.ssh/id_rsa_passwordless.key-cert type -1
debug1: identity file /home/mctom/.ssh/id_rsa type 0
debug1: identity file /home/mctom/.ssh/id_rsa-cert type -1
debug1: identity file /home/mctom/.ssh/id_ed25519_passwordless type 3
debug1: identity file /home/mctom/.ssh/id_ed25519_passwordless-cert type -1
debug1: Local version string SSH-2.0-OpenSSH_9.1
debug1: Remote protocol version 2.0, remote software version dropbear_2022.83
debug1: compat_banner: no match: dropbear_2022.83
debug2: fd 3 setting O_NONBLOCK
debug1: Authenticating to virtualbox:11122 as 'mctom'
debug3: put_host_port: [virtualbox]:11122
debug3: record_hostkey: found key type RSA in file /home/mctom/.ssh/known_hosts:29
debug3: load_hostkeys_file: loaded 1 keys from [virtualbox]:11122
debug1: load_hostkeys: fopen /home/mctom/.ssh/known_hosts2: No such file or directory
debug1: load_hostkeys: fopen /etc/ssh/ssh_known_hosts: No such file or directory
debug1: load_hostkeys: fopen /etc/ssh/ssh_known_hosts2: No such file or directory
debug3: order_hostkeyalgs: prefer hostkeyalgs: rsa-sha2-512-cert-v01@openssh.com,rsa-sha2-256-cert-v01@openssh.com,rsa-sha2-512,rsa-sha2-256
debug3: send packet: type 20
debug1: SSH2_MSG_KEXINIT sent
debug3: receive packet: type 20
debug1: SSH2_MSG_KEXINIT received
debug2: local client KEXINIT proposal
debug2: KEX algorithms: sntrup761x25519-sha512@openssh.com,curve25519-sha256,curve25519-sha256@libssh.org,ecdh-sha2-nistp256,ecdh-sha2-nistp384,ecdh-sha2-nistp521,diffie-hellman-group-exchange-sha256,diffie-hellman-group16-sha512,diffie-hellman-group18-sha512,diffie-hellman-group14-sha256,ext-info-c
debug2: host key algorithms: rsa-sha2-512-cert-v01@openssh.com,rsa-sha2-256-cert-v01@openssh.com,rsa-sha2-512,rsa-sha2-256,ssh-ed25519-cert-v01@openssh.com,ecdsa-sha2-nistp256-cert-v01@openssh.com,ecdsa-sha2-nistp384-cert-v01@openssh.com,ecdsa-sha2-nistp521-cert-v01@openssh.com,sk-ssh-ed25519-cert-v01@openssh.com,sk-ecdsa-sha2-nistp256-cert-v01@openssh.com,ssh-ed25519,ecdsa-sha2-nistp256,ecdsa-sha2-nistp384,ecdsa-sha2-nistp521,sk-ssh-ed25519@openssh.com,sk-ecdsa-sha2-nistp256@openssh.com
debug2: ciphers ctos: chacha20-poly1305@openssh.com,aes128-ctr,aes192-ctr,aes256-ctr,aes128-gcm@openssh.com,aes256-gcm@openssh.com
debug2: ciphers stoc: chacha20-poly1305@openssh.com,aes128-ctr,aes192-ctr,aes256-ctr,aes128-gcm@openssh.com,aes256-gcm@openssh.com
debug2: MACs ctos: umac-64-etm@openssh.com,umac-128-etm@openssh.com,hmac-sha2-256-etm@openssh.com,hmac-sha2-512-etm@openssh.com,hmac-sha1-etm@openssh.com,umac-64@openssh.com,umac-128@openssh.com,hmac-sha2-256,hmac-sha2-512,hmac-sha1
debug2: MACs stoc: umac-64-etm@openssh.com,umac-128-etm@openssh.com,hmac-sha2-256-etm@openssh.com,hmac-sha2-512-etm@openssh.com,hmac-sha1-etm@openssh.com,umac-64@openssh.com,umac-128@openssh.com,hmac-sha2-256,hmac-sha2-512,hmac-sha1
debug2: compression ctos: none,zlib@openssh.com,zlib
debug2: compression stoc: none,zlib@openssh.com,zlib
debug2: languages ctos: 
debug2: languages stoc: 
debug2: first_kex_follows 0 
debug2: reserved 0 
debug2: peer server KEXINIT proposal
debug2: KEX algorithms: curve25519-sha256,curve25519-sha256@libssh.org,ecdh-sha2-nistp521,ecdh-sha2-nistp384,ecdh-sha2-nistp256,diffie-hellman-group14-sha256,diffie-hellman-group14-sha1,kexguess2@matt.ucc.asn.au
debug2: host key algorithms: rsa-sha2-256,ssh-rsa
debug2: ciphers ctos: chacha20-poly1305@openssh.com,aes128-ctr,aes256-ctr,aes128-cbc,aes256-cbc
debug2: ciphers stoc: chacha20-poly1305@openssh.com,aes128-ctr,aes256-ctr,aes128-cbc,aes256-cbc
debug2: MACs ctos: hmac-sha1,hmac-sha2-256
debug2: MACs stoc: hmac-sha1,hmac-sha2-256
debug2: compression ctos: zlib@openssh.com,none
debug2: compression stoc: zlib@openssh.com,none
debug2: languages ctos: 
debug2: languages stoc: 
debug2: first_kex_follows 0 
debug2: reserved 0 
debug1: kex: algorithm: curve25519-sha256
debug1: kex: host key algorithm: rsa-sha2-256
debug1: kex: server->client cipher: chacha20-poly1305@openssh.com MAC: <implicit> compression: none
debug1: kex: client->server cipher: chacha20-poly1305@openssh.com MAC: <implicit> compression: none
debug3: send packet: type 30
debug1: expecting SSH2_MSG_KEX_ECDH_REPLY
debug3: receive packet: type 31
debug1: SSH2_MSG_KEX_ECDH_REPLY received
debug1: Server host key: ssh-rsa SHA256:By8HIV8cnKChg0Q80nMuijqvFMGWO7JCPzW4q1eaw0o
debug3: put_host_port: [192.168.0.254]:11122
debug3: put_host_port: [virtualbox]:11122
debug3: record_hostkey: found key type RSA in file /home/mctom/.ssh/known_hosts:29
debug3: load_hostkeys_file: loaded 1 keys from [virtualbox]:11122
debug1: load_hostkeys: fopen /home/mctom/.ssh/known_hosts2: No such file or directory
debug1: load_hostkeys: fopen /etc/ssh/ssh_known_hosts: No such file or directory
debug1: load_hostkeys: fopen /etc/ssh/ssh_known_hosts2: No such file or directory
debug1: Host '[virtualbox]:11122' is known and matches the RSA host key.
debug1: Found key in /home/mctom/.ssh/known_hosts:29
debug3: send packet: type 21
debug2: ssh_set_newkeys: mode 1
debug1: rekey out after 134217728 blocks
debug1: SSH2_MSG_NEWKEYS sent
debug1: expecting SSH2_MSG_NEWKEYS
debug3: receive packet: type 21
debug1: SSH2_MSG_NEWKEYS received
debug2: ssh_set_newkeys: mode 0
debug1: rekey in after 134217728 blocks
debug1: Will attempt key: /home/mctom/.ssh/id_rsa_passwordless.key RSA SHA256:fieUWhBcGSuGQmS2BoPuJqQtm3VBv1dMoYvixzyKUVw explicit
debug1: Will attempt key: /home/mctom/.ssh/id_rsa_passwordless.key RSA SHA256:fieUWhBcGSuGQmS2BoPuJqQtm3VBv1dMoYvixzyKUVw explicit
debug1: Will attempt key: /home/mctom/.ssh/id_rsa RSA SHA256:K1XUQzdCkNKkCKv/hokHO4hYevovSs5Og+C91a0mxtM explicit
debug1: Will attempt key: /home/mctom/.ssh/id_ed25519_passwordless ED25519 SHA256:ZlSR4obh6mv5Ft1zouF1LkFL2h8ehGgCAgou0zeSq4o explicit
debug2: pubkey_prepare: done
debug3: send packet: type 5
debug3: receive packet: type 7
debug1: SSH2_MSG_EXT_INFO received
debug1: kex_input_ext_info: server-sig-algs=<ssh-ed25519,sk-ssh-ed25519@openssh.com,ecdsa-sha2-nistp256,ecdsa-sha2-nistp384,ecdsa-sha2-nistp521,sk-ecdsa-sha2-nistp256@openssh.com,rsa-sha2-256,ssh-rsa>
debug3: receive packet: type 6
debug2: service_accept: ssh-userauth
debug1: SSH2_MSG_SERVICE_ACCEPT received
debug3: send packet: type 50






debug3: receive packet: type 51
debug1: Authentications that can continue: publickey
debug3: start over, passed a different list publickey
debug3: preferred publickey,keyboard-interactive,password
debug3: authmethod_lookup publickey
debug3: remaining preferred: keyboard-interactive,password
debug3: authmethod_is_enabled publickey
debug1: Next authentication method: publickey
debug1: Offering public key: /home/mctom/.ssh/id_rsa_passwordless.key RSA SHA256:fieUWhBcGSuGQmS2BoPuJqQtm3VBv1dMoYvixzyKUVw explicit
debug3: send packet: type 50
debug2: we sent a publickey packet, wait for reply
debug3: receive packet: type 60
debug1: Server accepts key: /home/mctom/.ssh/id_rsa_passwordless.key RSA SHA256:fieUWhBcGSuGQmS2BoPuJqQtm3VBv1dMoYvixzyKUVw explicit
debug3: sign_and_send_pubkey: using publickey with RSA SHA256:fieUWhBcGSuGQmS2BoPuJqQtm3VBv1dMoYvixzyKUVw
debug3: sign_and_send_pubkey: signing using rsa-sha2-256 SHA256:fieUWhBcGSuGQmS2BoPuJqQtm3VBv1dMoYvixzyKUVw
debug3: send packet: type 50
debug3: receive packet: type 52
Authenticated to virtualbox ([192.168.0.254]:11122) using "publickey".
debug1: channel 0: new [client-session]
debug3: ssh_session2_open: channel_new: 0
debug2: channel 0: send open
debug3: send packet: type 90
debug1: Entering interactive session.
debug1: pledge: filesystem
debug3: receive packet: type 91
debug2: channel_input_open_confirmation: channel 0: callback start
debug2: fd 3 setting TCP_NODELAY
debug3: set_sock_tos: set socket 3 IP_TOS 0x20
debug2: client_session2_setup: id 0
debug1: Sending command: whoami
debug2: channel 0: request exec confirm 1
debug3: send packet: type 98
debug2: channel_input_open_confirmation: channel 0: callback done
debug2: channel 0: open confirm rwindow 65536 rmax 32759
debug3: receive packet: type 99
debug2: channel_input_status_confirm: type 99 id 0
debug2: exec request accepted on channel 0










mctom
debug3: receive packet: type 96
debug2: channel 0: rcvd eof
debug2: channel 0: output open -> drain
debug2: channel 0: obuf empty
debug2: chan_shutdown_write: channel 0: (i0 o1 sock -1 wfd 5 efd 6 [write])
debug2: channel 0: output drain -> closed
debug3: receive packet: type 98
debug1: client_input_channel_req: channel 0 rtype exit-status reply 0
debug3: receive packet: type 97
debug2: channel 0: rcvd close
debug2: chan_shutdown_read: channel 0: (i0 o3 sock -1 wfd 4 efd 6 [write])
debug2: channel 0: input open -> closed
debug3: channel 0: will not send data after close
debug2: channel 0: almost dead
debug2: channel 0: gc: notify user
debug2: channel 0: gc: user detached
debug2: channel 0: send close
debug3: send packet: type 97
debug2: channel 0: is dead
debug2: channel 0: garbage collecting
debug1: channel 0: free: client-session, nchannels 1
debug3: channel 0: status: The following connections are open:
  #0 client-session (t4 r0 i3/0 o3/0 e[write]/0 fd -1/-1/6 sock -1 cc -1 io 0x00/0x00)

debug3: send packet: type 1
Transferred: sent 3864, received 2140 bytes, in 1.9 seconds
Bytes per second: sent 2012.8, received 1114.8
debug1: Exit status 0

real	0m4,831s
user	0m0,025s
sys	0m0,004s
Nevertheless I added some entries to /etc/hosts and the time was repeatedly under 600ms, so it does look like an improvement, but it's not conclusive.
Some other voices on the internet pointed to possible trouble generating random numbers, but I can't see why that would happen only when I chroot dropbear from another machine.

Re: True Love - Ultimat resource sharing over LAN

Post by mctom »

strace is a cool tool, never heard of it before. I managed to attach it to the dropbear server to see what's going on. Here's the output from a similar "whoami" session that took 3.412s.
Can't tell what all this means yet...
[Attachment: 2023-02-26-011339_1031x850_scrot.png]
EDIT: A-ha!

Code: Select all

2810  connect(7, {sa_family=AF_UNIX, sun_path="/var/run/nscd/socket"}, 110) = -1 ESTALE (Stale file handle) <0.416715>
2810  close(7)                          = 0 <0.000014>
2810  socket(AF_UNIX, SOCK_STREAM|SOCK_CLOEXEC|SOCK_NONBLOCK, 0) = 7 <0.000012>
2810  connect(7, {sa_family=AF_UNIX, sun_path="/var/run/nscd/socket"}, 110) = -1 ESTALE (Stale file handle) <1.266053>
2810  close(7)                          = 0 <0.000018>
strace is amazing, why didn't I know about this!?
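
For picking such stalls out of a long trace, a small filter helps. This is a sketch under assumptions: the trace is taken to be in the format shown above, with the per-call duration in angle brackets at the end of each line (what strace -T prints), captured with something like strace -f -T -o FILE -p PID. The exact command used for the capture above is not stated, and the function name is made up:

```bash
# slow_syscalls [THRESHOLD] < trace.log
# Print only the trace lines whose trailing <seconds> duration is at
# least THRESHOLD seconds (default 0.1).
slow_syscalls() {
    local threshold="${1:-0.1}"
    # LC_ALL=C keeps "." as the decimal separator regardless of locale.
    LC_ALL=C awk -v t="$threshold" '
        match($0, /<[0-9]+\.[0-9]+>$/) {
            # Strip the angle brackets and compare numerically.
            d = substr($0, RSTART + 1, RLENGTH - 2)
            if (d + 0 >= t + 0) print
        }'
}
```

Fed the five lines above with a threshold of 0.1, it should leave only the two connect() calls that returned ESTALE.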

EDIT: Explicitly exporting /run as an nfs share solves the issue; now the whoami roundtrip is well below 100ms.
Gotta love dropbear, it works much faster than openssh server.

User avatar
mad_ady
Posts: 11322
Joined: Wed Jul 15, 2015 5:00 pm
languages_spoken: english
ODROIDs: XU4 (HC1, HC2), C1+, C2, C4 (HC4), N1, N2, H2, Go, Go Advance, M1
Location: Bucharest, Romania
Has thanked: 647 times
Been thanked: 1081 times
Contact:

Re: True Love - Ultimat resource sharing over LAN

Post by mad_ady »

Hmm, I don't know if I'd export /dev and /run from the host... When you write garbage to /dev/null on the minion, you'll be limited by NFS throughput (give it a test!). Also, stuff from /dev/random will be read only from the host, ignoring the minion entropy and consuming it on the host - so expect delays waiting for entropy to build up if you have many minions.
/run should also be volatile, and I don't think you need to export it, but mount it as a tmpfs (or whatever type it is) after chroot, on the minion.

User avatar
rooted
Posts: 10037
Joined: Fri Dec 19, 2014 9:12 am
languages_spoken: english
Location: Gulf of Mexico, US
Has thanked: 788 times
Been thanked: 587 times
Contact:

Re: True Love - Ultimat resource sharing over LAN

Post by rooted »

To fix any entropy issue I would install the haveged entropy daemon. We figured out this was causing issues on Android around ten years ago, and I and some others implemented various methods to solve the problem.

My solution was to use frandom since I'm a kernel guy; others used haveged.


Re: True Love - Ultimat resource sharing over LAN

Post by mctom »

I don't think entropy will be an issue on machines whose sole purpose is to get thrashed with computation tasks :)

Minions use their own /dev, for a few reasons including pseudo terminal allocation for SSH sessions, /dev/urandom, /dev/null and so on. There was no good reason to mount /dev from the workstation, most if not all calls would not work. And I think /dev is not exportable via nfs anyway.
/run mounted from the host solves the issue for dropbear, and possibly will solve it for other programs seeking data in /run, so for the sake of compatibility, can't see why not.


Right now, everything is mounted using the following commands.
busybox mount supports nfs, unlike the coreutils mount, hence the different calls.

Code: Select all

busybox mount -o nolock tomusiomat:/ /remote
mount --bind /dev /remote/dev
mount --bind /dev/pts /remote/dev/pts
The exports file on H3+ includes /, /tmp and /run.

@mad_ady, how do you propose to mount /run as tmpfs, and what are the foreseen benefits of that?

User avatar
rooted
Posts: 10037
Joined: Fri Dec 19, 2014 9:12 am
languages_spoken: english
Location: Gulf of Mexico, US
Has thanked: 788 times
Been thanked: 587 times
Contact:

Re: True Love - Ultimat resource sharing over LAN

Post by rooted »

Check your entropy you may be surprised, should stay in the 3K range

Code: Select all

 cat /proc/sys/kernel/random/entropy_avail





Re: True Love - Ultimat resource sharing over LAN

Post by mctom »

rooted wrote:
Mon Feb 27, 2023 2:51 am
Check your entropy you may be surprised, should stay in the 3K range

Code: Select all

 cat /proc/sys/kernel/random/entropy_avail
It's constantly at 256 on my Manjaro H3+ workstation, and at about 600 on my minions and slowly rising.

Eh, maybe I should go telnet instead. But SSH is such an elegant solution with regard to passing the user identity and so on. I wish it could be unencrypted on demand..

Re: True Love - Ultimat resource sharing over LAN

Post by mctom »

A dropbear make benchmark of the current setup:

H3+ for reference:

Code: Select all

-j16	real 0m13,238s user 0m37,389s sys 0m7,678s 
-j8	real 0m13,682s user 0m36,880s sys 0m7,258s
H3+ and 3 Minions, make executed locally, only gcc calls passed to Minions

Code: Select all

-j4	real 1m50,756s user 0m26,224s sys 0m6,799s 
-j8	real 0m50,420s user 0m26,255s sys 0m6,816s
-j12	real 0m34,910s user 0m28,036s sys 0m7,682s
-j16	real 0m26,492s user 0m29,486s sys 0m8,271s
-j24	real 0m21,726s user 0m31,766s sys 0m8,484s 
-j32	real 0m18,587s user 0m34,334s sys 0m9,040s
-j40	real 0m19,090s user 0m35,180s sys 0m9,183s
-j48	real 0m16,953s user 0m35,745s sys 0m9,282s

-j56	real 0m17,511s user 0m35,598s sys 0m8,903s
-j64	real 0m17,725s user 0m36,176s sys 0m9,273s
gcc commands were executed by H3+ if all minions were loaded with work, that's why user time rises with jobs param above minion core count (12).
This also shows that make itself utilizes a lot of resources for this build. It consists of hundreds of small C files, so it's not even an optimal problem for a small cluster of minions to begin with. It's very IO intensive, not CPU intensive.

I tried running a make job entirely on Minions cluster, alas it crashes at the very end, trying to run the linker. It complains that a random *.o file is missing (it's not). I guess it must be an nfs sync problem - linker wants a file that has been created milliseconds before by some other machine. I wonder if there's any way around that.

Also tried building ipxe and grub with poor results. ipxe was complaining about the input data when it launched an assembler. I guess my wrapper must have some issues with "transparency". Similarly grub builds failed because of the brackets in commands it issued. Wrapper needs some more work..
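
One likely culprit for this kind of breakage (an assumption, not confirmed here): ssh joins its command arguments into a single string and the remote shell parses that string again, so brackets, quotes and spaces need a second layer of quoting. A minimal sketch of a helper that re-quotes every argument in plain single quotes, which any POSIX shell on the other end should understand; the function name is made up:

```bash
# quote_args ARG...
# Print the arguments as one line in which every argument is wrapped in
# single quotes, with embedded single quotes written as '\''.
# Limitation of this sketch: trailing newlines inside an argument are lost.
quote_args() {
    local arg
    for arg in "$@"; do
        printf "'%s' " "$(printf '%s' "$arg" | sed "s/'/'\\\\''/g")"
    done
    echo
}

# Hypothetical use inside a wrapper (MINION is a placeholder variable):
#   ssh -t -p 11122 "$MINION" "cd $(quote_args "$PWD") && $(quote_args "$@")"
```

In bash, printf '%q ' "$@" does a similar job, but it may emit $'...' forms that simpler shells such as busybox ash do not understand, which is why plain single quotes are used here.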

EDIT: A-ha!
https://serverfault.com/questions/11427 ... -on-server
Last edited by mctom on Mon Feb 27, 2023 5:40 pm, edited 1 time in total.

User avatar
mad_ady
Posts: 11322
Joined: Wed Jul 15, 2015 5:00 pm
languages_spoken: english
ODROIDs: XU4 (HC1, HC2), C1+, C2, C4 (HC4), N1, N2, H2, Go, Go Advance, M1
Location: Bucharest, Romania
Has thanked: 647 times
Been thanked: 1081 times
Contact:

Minions - the affordable CPU farm

Post by mad_ady »

This topic is split from mctom's "True Love" thread


Re: Minions - the affordable CPU farm

Post by mctom »

A quick update after short tests yesterday.. I invested a while to better understand the nfs mount options, and good Lord, there are export options AND mount options, which are not exactly the same. :D
So I disabled file/directory attributes caching in /home and /tmp, and enabled 24h file attributes cache on /usr (so essentially eternal cache). That also includes /lib, so basically most executables. It's a similar strategy to what outrun does.

The performance impact is not known yet because I broke Hypothalamus in the process.
BUT tasks run entirely on minions actually complete now, so one problem is solved.

Speaking of Hypothalamus, it doesn't work as intended. It has a non-zero reaction time (because CPU utilization is calculated as a change in statistics over some period). But also, when make launches "n" tasks, it does so almost instantly, so all calls to minion wrappers happen at the same time and there's no way of coordinating that.. So the initial task distribution was completely random.

I think I'll have to fall back to "slot" based distribution for now, one task per core. Now I'll have to invent a way of ensuring that. Might as well limit the number of SSH connections, lol

User avatar
mad_ady
Posts: 11322
Joined: Wed Jul 15, 2015 5:00 pm
languages_spoken: english
ODROIDs: XU4 (HC1, HC2), C1+, C2, C4 (HC4), N1, N2, H2, Go, Go Advance, M1
Location: Bucharest, Romania
Has thanked: 647 times
Been thanked: 1081 times
Contact:

Re: Minions - the affordable CPU farm

Post by mad_ady »

Most likely this kind of process distribution already exists, but you need to know what to look for. Perhaps our good overlord ChatGPT, in its infinite wisdom, knows more...

fvolk
Posts: 820
Joined: Sun Jun 05, 2016 11:04 pm
languages_spoken: english
ODROIDs: C4, H3
Has thanked: 0
Been thanked: 122 times
Contact:

Re: True Love - Ultimat resource sharing over LAN

Post by fvolk »

mctom wrote:
Mon Feb 20, 2023 8:53 pm
Well, the PXE boot and initial setup takes anything between 3 and 4 minutes. UEFI is to blame, so they say, the image download speed is awful.
PXE loading image from a TFTP?
TFTP is block-by-block protocol:
send block -> ack block -> send block -> ack block...
...and that's not fast


Re: True Love - Ultimat resource sharing over LAN

Post by mctom »

fvolk wrote:
Thu Mar 02, 2023 9:00 pm
PXE loading image from a TFTP?
TFTP is block-by-block protocol:
send block -> ack block -> send block -> ack block...
...and that's not fast
That has been dealt with. Read on. :)
Now Minion image is hosted via nfs.

Re: Minions - the affordable CPU farm

Post by mctom »

Let me begin another handful of news with a funny anecdote.

At this point, I can do this: ssh minion4 -p 11122
I am logged into minion4, but interact with my H3+ filesystem (this includes my bash, settings and software). The only difference is a different hostname in bash.
I mixed up terminal windows and started editing my notes using vim on a minion, lol.
At some point I wanted to reset the minion machine to apply changes in init scripts, and I was left with a frozen vim, and a few lines of unsaved changes.
The terminal experience is so realistic that it's easy to mistake it for the real thing.

And now, introducing Mr Porter.
mad_ady wrote:
Tue Feb 28, 2023 7:06 pm
Most likely this kind of process distribution already exists, but you need to know what to look for.
(Picture of Till Lindemann singing "NEIN")
Noooo let's make a netcat and bash contraption! :lol:

So in the end I figured out a way of using busybox netcat to make a super duper simple "server" that accepts commands and responds to them.

Something like this:

Code: Select all

INPUT=$(cat -)

case "$INPUT" in
	"uptime")	busybox uptime ;;
	"poweroff")	echo "Till next time!"; poweroff ;;
	"slots") 	echo $(($(busybox nproc) - $(netstat -tn 2> /dev/null | grep $IP:$SSH_PORT | wc -l))) ;;
	*) echo "What?"
esac
And this script is run on minions like so: busybox nc -lk -p 11123 -e /remote/etc/minions/init/porter.sh
And accepts commands sent like this: echo uptime | busybox nc <ip> 11123

This works in the background, and accepts one-liner commands, and does stuff. Most importantly, it returns a number of "free slots" on the machine, which is computed as nproc minus active ssh connections.
minions wrapper will pick a random host, ask for "slots", and send a task over there if the return value is non-zero.
The response time is within 10ms range, so far this looks feasible.
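
The selection step could be sketched like this. Everything named here is hypothetical: the MINIONS list and its placeholder addresses, the function names, and the use of shuf for the random order. Only the "slots" command and port 11123 come from the porter script above:

```bash
# ASSUMPTIONS: MINIONS is a space-separated list of minion addresses
# (placeholders below), and each minion runs the porter on PORTER_PORT.
MINIONS="${MINIONS:-192.168.0.151 192.168.0.152 192.168.0.153}"
PORTER_PORT="${PORTER_PORT:-11123}"

# Ask one porter for its free slot count; prints nothing if it is unreachable.
query_slots() {
    echo slots | busybox nc "$1" "$PORTER_PORT" 2>/dev/null
}

# Print the first minion, in random order, that reports a positive slot
# count. Returns 1 if nobody has room, so the caller can run the task locally.
pick_minion() {
    local host slots
    for host in $(printf '%s\n' $MINIONS | shuf); do
        slots=$(query_slots "$host")
        case "$slots" in
            ''|*[!0-9]*) continue ;;   # no answer, garbage, or a negative number
        esac
        if [ "$slots" -gt 0 ]; then
            echo "$host"
            return 0
        fi
    done
    return 1
}
```

Note that two wrappers started at the same instant can still both see the same free slot, so this narrows the race rather than removing it.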

Another revelation: drop hostnames from the scripts. Recently I looked into pihole statistics and the usual network traffic was dwarfed by minion's DNS queries. :D

Now updating minions wrapper to depend on porter instead of hypothalamus, a failed experiment indeed.

User avatar
mad_ady
Posts: 11322
Joined: Wed Jul 15, 2015 5:00 pm
languages_spoken: english
ODROIDs: XU4 (HC1, HC2), C1+, C2, C4 (HC4), N1, N2, H2, Go, Go Advance, M1
Location: Bucharest, Romania
Has thanked: 647 times
Been thanked: 1081 times
Contact:

Re: Minions - the affordable CPU farm

Post by mad_ady »

which is computed as nproc minus active ssh connections.
Add a | grep ESTABLISHED to process only active connections (as opposed to closing connections in TIME_WAIT state). And add a bit of error handling in case the result is negative (more connections than cores).

Re: Minions - the affordable CPU farm

Post by mctom »

Trying to combat a significant overhead of ssh authentication and establishing TCP connection to begin with, I was thinking.. Is there a way to reuse the existing SSH connection, rather than reconnect each time?
And then I remembered: Not only that, SSH also supports channels!

And so does OpenSSH client. It can be set up to work completely transparently. One connection is set up and persists, and is reused as needed, even for multiple sessions. All I had to do was to add this to ~/.ssh/config:

Code: Select all

Host 192.168.0.15?
	ControlMaster auto
	ControlPath ~/.ssh/sockets/ssh_mux_%h_%p_%r
	ControlPersist 600
Now instead of 300ms, each "pwd" issued over ssh takes about 60ms. A massive improvement!
But! I broke Mr Porter :lol: It returned free slots as a number of SSH connections vs nproc, now I always have exactly one connection, even when idle. And no idea how to count the tasks from the minion's perspective.

EDIT: With completely random task distribution, while hypothalamus and porter are not usable at this point, the current make time is about 18s, but I've seen a record-breaking ~16,5s, although I wasn't able to repeat that. The last 3-4 seconds are spent on building the main executables, which are randomly assigned to minions while the workstation idles. So beating the local build time likely will never happen for this particular test of the dropbear build.
A smarter scheduler should bring yet another improvement.

User avatar
mad_ady
Posts: 11322
Joined: Wed Jul 15, 2015 5:00 pm
languages_spoken: english
ODROIDs: XU4 (HC1, HC2), C1+, C2, C4 (HC4), N1, N2, H2, Go, Go Advance, M1
Location: Bucharest, Romania
Has thanked: 647 times
Been thanked: 1081 times
Contact:

Re: Minions - the affordable CPU farm

Post by mad_ady »

A nice improvement indeed!

Regarding calculating usage - does each session spawn its own shell and do its own authentication? If so, you should see active shells as lines in the output of w.


Re: Minions - the affordable CPU farm

Post by mctom »

The w command is not present in tinycore, and it seems it just grabs its data from a few filesystem locations (where else).

Yes, I explicitly ask for a new pseudoterminal for each ssh connection, so I can just do this:
ls /dev/pts | wc -w

And it does seem to work in a simple manual test. Thanks. :)
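
A slightly more defensive variant of the same idea, as a sketch only. On Linux the /dev/pts directory usually also contains a ptmx entry, which a plain ls | wc -w counts as well, so this version counts only the purely numeric names. The function names count_pts and free_slots are made up here, and the "nproc minus busy terminals" formula is my assumption about what a porter-like script would want:

```shell
#!/bin/sh
# count_pts [DIR]: count entries in DIR (default /dev/pts) whose names
# are purely numeric, i.e. allocated pseudoterminals.
count_pts() (
    dir=${1:-/dev/pts}
    n=0
    for entry in "$dir"/*; do
        case ${entry##*/} in
            ''|*[!0-9]*) ;;          # skip ptmx and anything non-numeric
            *) n=$((n + 1)) ;;
        esac
    done
    echo "$n"
)

# free_slots TOTAL BUSY: TOTAL minus BUSY, never below zero.
free_slots() (
    free=$(( $1 - $2 ))
    [ "$free" -lt 0 ] && free=0
    echo "$free"
)
```

On a minion this could be called as free_slots "$(nproc)" "$(count_pts)". Any other terminal on the machine (an interactive login, tmux) also holds a pts, so the result is only an estimate.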

That will not work for counting workstation slots, though..

I modified hypothalamus to write the CPU load to a shared memory location (/dev/shm/hypothalamus) as a four-byte C string, null-terminated. In fact, null-padded. I still have to trim the cat output, but hey, it works.
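
For the trimming, one option is to drop the padding bytes with tr instead of cleaning up after cat. A minimal sketch, assuming the file holds a short ASCII number followed by NUL bytes as described above (read_load is a made-up name):

```shell
#!/bin/sh
# read_load [FILE]: print the contents of FILE with all NUL bytes removed.
# FILE defaults to the shared memory location mentioned above.
read_load() {
    tr -d '\000' < "${1:-/dev/shm/hypothalamus}"
}
```

Something like load=$(read_load) then yields a clean string to compare against a threshold.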

Now the problem identified before haunts me again.. Both grub and ipxe builds crash because their compiler command lines contain brackets or quotation marks, or both...

(I added newlines for readability)

Code: Select all

/usr/bin/gcc -E -DARCH=x86_64 -DPLATFORM=efi -DPLATFORM_efi -DSECUREBOOT=0 -fstrength-reduce 
-fomit-frame-pointer -falign-jumps=1 -falign-loops=1 -falign-functions=1 -m64 -mno-mmx -mno-sse -fshort-wchar 
-Ui386 -DNVALGRIND -fpie -mno-red-zone -fstack-protector-strong -mstack-protector-guard=global -Iinclude -I. 
-Iarch/x86/include -Iarch/x86_64/include -Os -g -ffreestanding -fcommon -Wall -W -Wformat-nonliteral -Wno-array-bounds 
-Wno-dangling-pointer -fno-dwarf2-cfi-asm -fno-exceptions -fno-unwind-tables -fno-asynchronous-unwind-tables 
-Wno-address -Wno-stringop-truncation -Wno-address-of-packed-member -fcf-protection=none -Werror -ffunction-sections 
-include include/compiler.h -DASM_TCHAR=@ -DASM_TCHAR_OPS=@ -Ulinux -DASSEMBLY -DVERSION="1.21.1+ (g6c033)" 
-DOBJECT=lkrnprefix arch/x86/prefix/lkrnprefix.S
This part seems to cause the trouble: -DVERSION="1.21.1+ (g6c033)"

gcc: error: (g6c033)": linker input file not found: No such file or directory

This is passed to a minion like so:

Code: Select all

ENVS=$(env | sed 's/=/="/;s/$/"/' | tr '\n' ' ')
ssh -t -q $MINION -p 11122 "cd $PWD;$ENVS $@"
Calling it on the workstation from within the wrapper gives the same result. (There I run it simply as $@.)
The method I used earlier was similar; I wanted to skip the extra bash if possible (it doesn't change anything so far):

Code: Select all

ssh -t -q $MINION -p 11122 "/bin/bash -c 'cd $PWD;$ENVS $@'"
I tried escaping different characters (adding backslashes before brackets, plus sign or quotation marks) but that doesn't seem to help at all. I think the problem is a space character in -DVERSION, and I can't just blindly escape every space character. There must be a better way, surely...
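
One way out, sketched here under the assumption that the remote side is a POSIX-style shell: instead of escaping individual characters, wrap every argument in single quotes (rewriting each embedded ' as '\'') and join the results into one command string. The helper names quote_arg and quote_cmd are made up for this illustration and are not part of any existing tool.

```shell
#!/bin/sh
# quote_arg STRING: print STRING wrapped in single quotes, with every
# embedded single quote rewritten as '\'' so a shell reads it back as
# exactly one word.
quote_arg() (
    rest=$1
    out=
    while :; do
        case $rest in
            *\'*)
                head=${rest%%\'*}     # text before the first quote
                rest=${rest#*\'}      # text after the first quote
                out="$out$head'\\''"
                ;;
            *) break ;;
        esac
    done
    printf "'%s'" "$out$rest"
)

# quote_cmd ARG...: quote every argument and join them with spaces.
quote_cmd() (
    for arg in "$@"; do
        quote_arg "$arg"
        printf ' '
    done
)
```

In the wrapper this would replace the bare $@, roughly: ssh -t -q "$MINION" -p 11122 "cd $(quote_arg "$PWD") && $(quote_cmd "$@")". It only helps if the arguments still have their original boundaries at that point, so it has to run where "$@" is still intact.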


Re: Minions - the affordable CPU farm

Post by mad_ady »

I use a similar approach to running a remote program over ssh with the help of a wrapper. I pass parameters that sometimes need quoting. I use a perl module (because I'm old school) to handle quoting for me: https://metacpan.org/pod/String::ShellQuote

Here's my wrapper, for reference. It has various symlinks pointing to it and uses the symlink name to call the corresponding remote program.

Code: Select all

#!/usr/bin/perl

use strict;
use warnings;
use String::ShellQuote;

#this script is just a wrapper used to call execution (with supplied parameters) to the equivalent program on remoteServer
#also redirects pipes

my $programName = $0;
my $parameters = shell_quote(@ARGV);

#print "DBG: Going to run remotely $0 $parameters\n";
#open(LOG, ">>", "/tmp/remoteProxy.log") or die $!;
#print LOG "$0 $parameters\n";

my $commandString = "ssh -i /path/to/id_rsa remoteUser\@remoteServer \"$programName $parameters\"";
#print "DBG: Going to run locally $commandString\n";
print `$commandString`;
See if it solves your issues. Not sure if it correctly escapes backticks...


Re: Minions - the affordable CPU farm

Post by mctom »

I've spent 5 hours trying to understand what the hell is going on, how quoting works, how many quotes and layers of escaping are needed, how many shells are even involved.. Is make spawning a shell? Does ssh? What if I call /bin/bash within ssh? And does gcc want to see those quotes or shall they be removed..

And so on.. Nothing made any sense, especially errors I got in return :D

To make matters worse, my case is really complex. Not only do I want to launch a command, but also set the current directory and environment variables before I do so. AND redirect pipes, hopefully that still works..
And also, keep in mind how the jobs are delegated to minions in the first place.
What I did was create /home/mctom/.local/bin/gcc, which calls minion /usr/bin/gcc $@. So I can freely choose which commands are delegated to minions, which, by the way, makes it more universal than distcc. I'll call these the gcc wrapper and the minion wrapper.

@mad_ady, thanks for the script.. It wasn't of much use directly because my case defeated even that, but while installing its prerequisites I discovered that the library you've been using also exists as a standalone tool (shell-quote), which allows for more flexibility.

So I've made a setup that at least compiles ipxe exclusively on minions.. But now it crashes when run locally, with yet another peculiar error.
I mean, it does work if I ssh to localhost :lol:

gcc wrapper:

Code: Select all

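#!/bin/bash
# Real compiler to run on the minion (this wrapper shadows gcc in PATH)
FULLPATH=/usr/bin/gcc
# Quote here, while "$@" still holds the original argument boundaries;
# the result is handed to minion as one already-quoted string
minion "$(shell-quote "$FULLPATH" "$@")"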
minion wrapper (only the essentials):

Code: Select all

ENVS=$(env | sed 's/=/="/;s/$/"/' | tr '\n' ' ')
ssh -t -q $MINION -p 11122 "cd $PWD && $ENVS $@"

#This is how I ran the commands locally... Now it doesn't work anymore
$@
The problem is, I have to call shell-quote at the gcc wrapper stage, otherwise stuff passed to minion will be already stripped of the original quotes. But then, it's messed up and won't run locally.

I'm tired, gonna solve this tomorrow.

EDIT: eval $@ :)
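
Why eval helps, as far as I understand it: the string produced by shell-quote carries its quote characters as plain text. A bare $@ only word-splits that text, so the quotes reach the program literally and the arguments break at every space. eval makes the shell parse the text again, so the quotes act as quotes. A small sketch, with a hand-written quoted string standing in for shell-quote output and a made-up count_args helper in place of gcc:

```shell
#!/bin/sh
# count_args: stand-in for gcc, prints how many arguments it received.
count_args() { echo "$#"; }

# What an already-quoted command line looks like once it is stored as text.
quoted="count_args '-DVERSION=\"1.21.1+ (g6c033)\"' file.S"

# Plain expansion: word-splitting only, the quote characters stay literal.
run_plain() { $quoted; }

# eval: the text is parsed again as shell code, so quotes group words.
run_eval() { eval "$quoted"; }
```

run_plain reports 3 arguments (the -DVERSION value is cut at the space), run_eval reports 2. The flip side is that eval re-parses everything, so it is only reasonable on strings that went through a quoting tool first.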
