code logs -> 2020 -> Thu, 14 May 2020< code.20200513.log - code.20200515.log >
--- Log opened Thu May 14 00:00:12 2020
02:04
<@himi>
Having worked extensively on OpenStack I can say with confidence that amorphous clouds of interlocking madness aren't restricted to any language or platform . . .
04:25 Degi [Degi@Nightstar-cj4otk.dyn.telefonica.de] has quit [Ping timeout: 121 seconds]
04:26 catalyst [catalyst@Nightstar-ckcorc.dab.02.net] has joined #code
04:27 catalyst_ [catalyst@Nightstar-2m8p1m.dab.02.net] has quit [Ping timeout: 121 seconds]
04:30 Degi [Degi@Nightstar-hnn55p.dyn.telefonica.de] has joined #code
04:51 Vorntastic [uid293981@Nightstar-h2b233.irccloud.com] has joined #code
04:51 mode/#code [+qo Vorntastic Vorntastic] by ChanServ
05:40 celticminstrel [celticminst@Nightstar-nuu42v.dsl.bell.ca] has quit [[NS] Quit: And lo! The computer falls into a deep sleep, to awake again some other day!]
06:56 mac [macdjord@Nightstar-rslo4b.mc.videotron.ca] has joined #code
06:56 mode/#code [+o mac] by ChanServ
06:58 macdjord [macdjord@Nightstar-rslo4b.mc.videotron.ca] has quit [Ping timeout: 121 seconds]
08:42 McMartin [mcmartin@Nightstar-c25omi.ca.comcast.net] has quit [Connection closed]
08:42 McMartin [mcmartin@Nightstar-c25omi.ca.comcast.net] has joined #code
08:42 mode/#code [+ao McMartin McMartin] by ChanServ
10:51 Emmy [Emmy@Nightstar-9p7hb1.direct-adsl.nl] has joined #code
10:55 catalyst_ [catalyst@Nightstar-eervhn.dab.02.net] has joined #code
10:57 catalyst [catalyst@Nightstar-ckcorc.dab.02.net] has quit [Ping timeout: 121 seconds]
12:41
<@gnolam>
Well that was a stroke of lucj.
12:41
<@gnolam>
*luck
12:43
<@gnolam>
Been struggling with an issue that's been happening - and only happening - in a literal production environment, that seems to be tied to a specific combination of hardware, firmware and driver.
12:45
<@gnolam>
And it's intermittent and random. It might be happening a few times a day for them, or they can go an entire day without seeing it.
12:45
<@TheWatcher>
oooogh
12:47
<@TheWatcher>
I'm guessing you've isolated it, though?
12:48
<@gnolam>
And what with lockdowns and whatnot I can't just fly down there and have a look in person, and these are half-tonne machines, so even despite the whole "then they would have *no* production going, instead of just production with annoyances", it would be kind of impractical for them to send one up to me.
12:49
<@TheWatcher>
Yeah, most postal systems complain about you sending half-tonne machines, even if you do have the right postage on them.
12:49
<@TheWatcher>
So I'm told.
12:49
<@gnolam>
After a near bricking of the test rig I finally just got it up to some approximation of their firmware and driver setup.
12:50
<@gnolam>
And I had what I believe was the cause of the problem manifest on the very first try after reflashing the firmware!
12:51
<@gnolam>
Haven't been able to reproduce it since, of course. But hopefully I shouldn't have to.
12:51
<@TheWatcher>
Huzzah!
12:58
<@gnolam>
(Turns out that an operation that previously couldn't fail - reading a property representing a fairly wonky sensor value (I have to have workarounds for when it occasionally craps out) - can now suddenly block and throw an exasperated timeout exception when it gives up. And an unhandled exception can ruin anyone's day.)
12:59
<@gnolam>
(I'm assuming the whole thing is them trying to fix the wonkiness on their own.)
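A minimal sketch of the failure mode gnolam describes, assuming the sensor read surfaces as a Python `TimeoutError`. The names `read_with_fallback`, `read_sensor` and `last_good` are hypothetical, not taken from the actual codebase, and the real system is probably not written in Python.

```python
def read_with_fallback(read_sensor, last_good):
    """Return (value, ok). On timeout, reuse last_good instead of crashing.

    read_sensor is any zero-argument callable standing in for the property read.
    """
    try:
        return read_sensor(), True
    except TimeoutError:
        # The read blocked and gave up; fall back rather than let the
        # exception escape unhandled.
        return last_good, False
```

The caller can still treat `ok == False` as one more instance of the sensor's existing wonkiness.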
13:11
<&Reiver>
gnolam: We have a script that runs at work that checks, first and foremost, if a previous instance of the script is already running. Scripts cannot see each other, so instead they have a global variable.
13:11
<&Reiver>
It sits at either IDLE or RUNNING (or HALT, but I don't tell anyone about that one).
13:12
<&Reiver>
Whenever the script fires up, it checks the variable. Does it say IDLE? Cool, set it to RUNNING, do your thing. When you're done, set it to IDLE again.
13:12
<&Reiver>
There's a catch. The most common way for the script to take too long is by crashing out, never setting the variable back on the way out, and no, there's no exception handling to do it for me.
13:13
<&Reiver>
So the script has a second branch: Does it say you're already RUNNING? Okay. Change the value back to IDLE and go back to sleep, and see what the /next/ script trigger sees~
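The baton scheme Reiver describes can be sketched roughly as follows. This is an illustration only: a dict stands in for the shared global variable, `on_trigger` and `do_work` are hypothetical names, and leaving any other value (such as HALT) untouched is an assumption, since its behaviour is not described here.

```python
IDLE, RUNNING = "IDLE", "RUNNING"

def on_trigger(state, do_work):
    """Handle one scheduler trigger; state["flag"] is the shared variable."""
    if state["flag"] == IDLE:
        state["flag"] = RUNNING
        do_work()              # if this crashes, the flag is left at RUNNING
        state["flag"] = IDLE
        return "ran"
    if state["flag"] == RUNNING:
        # A previous run crashed or is still going: reset the flag and
        # let the next trigger do the work.
        state["flag"] = IDLE
        return "reset"
    return "skipped"           # assumption: other values are left alone
```

Note that the check and the set are separate steps, so two triggers arriving close together could both see IDLE.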
13:13
<~Vorntastic>
A Singleton Application Is Its Own DOS
13:14
<&Reiver>
I feel like this capitalisation deserves further explication.
13:19
<~Vorntastic>
https://devblogs.microsoft.com/oldnewthing/20060620-13/?p=30813
14:05
<&Reiver>
ha, yes!
14:05
<&Reiver>
In my case not a huge issue, but yes indeed
14:07 celticminstrel [celticminst@Nightstar-nuu42v.dsl.bell.ca] has joined #code
14:07 mode/#code [+o celticminstrel] by ChanServ
14:07
<@TheWatcher>
Reiver: there's no way for your scripts to check process IDs?
14:56
<@gnolam>
Heh, got a congratulatory e-mail from the hardware vendor.
15:18 Pinkhair [user1@Nightstar-g7hdo5.dyn.optonline.net] has joined #code
15:20 Pink [user1@Nightstar-g7hdo5.dyn.optonline.net] has quit [Ping timeout: 121 seconds]
16:17
<@gnolam>
Meanwhile, with another hardware vendor, it seems that the API document is just a pile of lies.
16:18
<@gnolam>
Literally copied and pasted the example JSON, and it doesn't work.
16:18
<@gnolam>
For some commands. Only some. Not all.
16:18
<@gnolam>
Instead, you need a new field! That's completely undocumented!
16:25
<@TheWatcher>
Obviously, duh!
17:28 catalyst_ [catalyst@Nightstar-eervhn.dab.02.net] has quit [Ping timeout: 121 seconds]
17:36 catalyst [catalyst@Nightstar-8tnapu.dab.02.net] has joined #code
17:41 Vorntastic [uid293981@Nightstar-h2b233.irccloud.com] has quit [[NS] Quit: Connection closed for inactivity]
17:41 mac [macdjord@Nightstar-rslo4b.mc.videotron.ca] has quit [Connection closed]
17:42 mac [macdjord@Nightstar-rslo4b.mc.videotron.ca] has joined #code
17:42 mode/#code [+o mac] by ChanServ
19:36 Vorntastic [uid293981@Nightstar-h2b233.irccloud.com] has joined #code
19:36 mode/#code [+qo Vorntastic Vorntastic] by ChanServ
19:37 Vornicus [Vorn@ServerAdministrator.Nightstar.Net] has joined #code
19:37 mode/#code [+qo Vornicus Vornicus] by ChanServ
20:08
<&ToxicFrog>
New name for `wget -krp`: "website taxidermy"
20:08
<&ToxicFrog>
Reiver: in addition to the aforementioned problem, this also has a race condition
20:10
<&ToxicFrog>
Traditionally one solves the first problem by putting the pid of the running process inside the file, so if it says it's running but it can't be found under that pid, you know it's crashed (and can also then do crash recovery or the like)
20:10
<&ToxicFrog>
And the second problem by using something like `flock`
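A rough sketch of the pid-in-a-file plus `flock` approach ToxicFrog outlines, assuming a Unix-like system where Python's `fcntl` module is available. The function names and the lock file path are hypothetical.

```python
import fcntl
import os

def try_acquire(path):
    """Return a locked file descriptor, or None if the lock is already held."""
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o644)
    try:
        # Exclusive, non-blocking: taking the lock is a single atomic step,
        # which avoids the check-then-set race.
        fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except BlockingIOError:
        os.close(fd)
        return None
    # Record our pid so others can tell who holds the lock.
    os.ftruncate(fd, 0)
    os.write(fd, f"{os.getpid()}\n".encode())
    return fd

def pid_alive(pid):
    """Best-effort check that a process with this pid exists."""
    try:
        os.kill(pid, 0)        # signal 0 only checks for existence
    except ProcessLookupError:
        return False
    except PermissionError:
        return True            # exists, but belongs to another user
    return True
```

The lock goes away when the holding process exits, crash or not, so the stored pid is mainly useful for diagnostics and crash recovery.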
20:49
<@abudhabi>
ToxicFrog: What's website taxidermy?
21:11
<&ToxicFrog>
abudhabi: someone was asking elsenet if it was possible to "taxidermy a dead website"
21:11
<&ToxicFrog>
`wget -krp` (with some other options) is used to download an entire website and convert it for offline viewing, which, IMO, counts.
21:49
<&Reiver>
There is no way for my scripts to check process IDs, no. There is also no way to check if something is running (well ok technically there is but it is unmaintainable)
21:49
<&Reiver>
There is in fact only one blessing: The scripts are timed, and have a timeout.
21:50
<&Reiver>
I do not control the latter, but if the script takes more than ten minutes it gets cancelled by the enterprise job controller.
21:51
<&Reiver>
So I run my stuff every five minutes, and that's why the system's check is "If you're running when I wake up, I'ma set you to not running and try again in five minutes", thus ensuring proper script spacing to rely on the job controller to kill it. I hope.
21:52
<&Reiver>
Sometimes it might take legitimately more than ten minutes; if that's the case I just end up with half-completed filesets... but this is why I have that madcap baton-passing system instead of anything saner.
21:53
<&Reiver>
The work already done sits there, gets slurped down, processed, updated tokens passed back to my system. This then means the scripts can tear rapidly through the 'no more work needed on those bits' files, and then chomp on the second tranche. Or third. Or whatever.
21:54
<&Reiver>
(In fairness it's never needed to take more than three stabs at a set of updates to complete; far more common is it overwhelms the internal memory management under such circumstances and so you end up with a trio of runs of 'everything up to huge file' (where it then errors out, leaving Huge File incomplete) -> 'Huge File' (after which it errors out on the next one somewhere) -> 'Everything else'.
21:55
<&Reiver>
Good times. Horrible, eldritch, morale-sapping... good times. *twitch*)
21:58
<@TheWatcher>
... yeah
22:04
<&Reiver>
People look at my horrible baton system and wonder why I did it instead of, say, an ethernet-style keep-on-spamming protocol or the like
22:05
<&Reiver>
They do not quite catch on just /how many/ ridiculous rube-goldberg style constraints I was under!
22:23
<@gnolam>
If there's something I have learned in this job it's that it's legacy systems all the way down.
22:23
<@gnolam>
Nowadays I don't even question it. I just assume that there are proper hysterical raisins for why something is the way it is and just roll with it.
22:24
<@gnolam>
And keep in mind, in the previous job I was /emulating punch cards/.
22:25
<@gnolam>
We still have customers on our DOS software.
22:42
<&Reiver>
gnolam: Ah, you would fit right in then :D
23:25 Emmy [Emmy@Nightstar-9p7hb1.direct-adsl.nl] has quit [Ping timeout: 121 seconds]
23:48 VirusJTG [VirusJTG@Nightstar-42s.jso.104.208.IP] has quit [Connection closed]
23:48 VirusJTG [VirusJTG@Nightstar-42s.jso.104.208.IP] has joined #code
23:48 mode/#code [+ao VirusJTG VirusJTG] by ChanServ
23:56 catalyst_ [catalyst@Nightstar-1dili4.dab.02.net] has joined #code
23:57 catalyst [catalyst@Nightstar-8tnapu.dab.02.net] has quit [Ping timeout: 121 seconds]
--- Log closed Fri May 15 00:00:13 2020