Archive for the Systems Engineering Category

Server Crippled by Updates Again

miqrogroove
2020-02-14T11:23:34-05:00

The February update cycle again sent my server into a reboot loop, shutting down all services until I could diagnose the problem on site.

Following the same steps as in my previous post, I switched the boot choice to Safe Mode and observed another boot failure. This time, instead of getting into the weeds of troubleshooting the update system with a second Safe Mode boot, I decided to let the server go back to the normal boot mode, because some other websites have reported this as a good solution.

In this case, the failed Safe Mode boot followed by no other action did successfully restore the server.

After reviewing the Event Viewer logs, I could only find a repeated Event ID 1074, “TrustedInstaller.exe has initiated the restart”. KB2992611 and KB890830 both installed successfully before the loop, then KB4502496, KB2822241, and KB4537814 installed after the loop.
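For anyone who prefers to pull those restart records from a command prompt instead of Event Viewer, here is a minimal sketch. It assumes the records are in the System log and that the built-in `wevtutil` tool is available; the function names are my own and not part of any library.

```python
import subprocess
from typing import List


def build_restart_query(max_events: int = 20) -> List[str]:
    """Build a wevtutil command that lists the newest Event ID 1074 records."""
    return [
        "wevtutil", "qe", "System",
        "/q:*[System[(EventID=1074)]]",  # XPath filter on the event ID
        f"/c:{max_events}",              # limit the number of records
        "/rd:true",                      # newest records first
        "/f:text",                       # human-readable output
    ]


def restart_events(max_events: int = 20) -> str:
    """Run the query on a Windows host and return the text output."""
    result = subprocess.run(
        build_restart_query(max_events),
        capture_output=True, text=True, check=True,
    )
    return result.stdout
```

Calling `restart_events()` on the affected server should show which process initiated each restart, which is where the TrustedInstaller.exe entries would appear.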

My current recommendation is to disable automatic updates for Windows servers and only perform update checks while on site. Also, run the update check twice in a row. The servicing stack update from December didn’t show up until after recovering from the reboot loop and then checking again for more updates.
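The "check twice" advice generalizes to "keep checking until a pass finds nothing new". A rough sketch of that loop follows. The `check_and_install` callable is hypothetical: it stands in for whatever you use to run one update pass on site, and is assumed to return the number of updates it installed.

```python
from typing import Callable


def update_until_clean(check_and_install: Callable[[], int],
                       max_passes: int = 5) -> int:
    """Repeat update passes until one finds nothing; return passes run.

    check_and_install is a hypothetical callable that performs one
    update check and returns how many updates it installed.
    """
    for passes in range(1, max_passes + 1):
        if check_and_install() == 0:
            return passes
    raise RuntimeError(f"updates still appearing after {max_passes} passes")
```

The pass limit is there so that a broken update, like the one in this post, cannot keep the loop running forever.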

14 Feb 2020

Category:
Systems Engineering


Reboot Loop After KB4525246 Update

miqrogroove
2019-11-21T00:41:03-05:00

Several other sites confirmed recent server failures after running Windows Updates. Here are the basic steps I used to recover.

Attach a keyboard and enter BIOS setup. Make sure Quick Boot is disabled.

Press F8 while restarting the server to open the Advanced Boot Options menu.

I tried Safe Mode, but did not see a successful boot there.

Next I tried Repair Your Computer, which brought me to the “Choose an option” screen.

Select Troubleshoot, then select Command Prompt. Follow the instructions to log in as one of the administrators.

21 Nov 2019

Category:
Systems Engineering


Solving Disconnected Folders Over Wi-Fi

miqrogroove
2018-08-17T13:10:11-05:00

Avoid the First Default Setting for Share Caching

I’m starting to realize that the Offline Files feature in Windows causes more problems than it solves when it comes to unreliable network connections.

In 2016, I described how to minimize the effects of an occasionally high ping when the slow-link mode goes into effect, in the post “Offline Files Stay Disconnected Over Wi-Fi”.

But that doesn’t solve the problem.  Fine-tuning or even disabling the slow-link mode forces the Client Side Cache (CSC) to use its “Action on server disconnect” configuration any time the network isn’t performing perfectly.  The default behavior, “Work offline”, treats each affected (meaning cache-enabled) share as being totally unavailable and then the CSC attempts to retrieve cached copies.  This happens even if the server is still available but failed a single ping check.

Why is this still a problem?  Well, in practice, most files don’t need to be available offline.  By default, the Windows file server is configured, and the Windows client is designed, to let each user select individual files as “Always available offline” from the file context menu.  When a user selects this option, that one file is copied to the CSC, and in theory that one file is always available.  This allows for targeted use and minimal sync time.  The problem arises with all the other files.  When the CSC goes offline and marks the shared folder as disconnected, it effectively blocks access to all the files that were never cached, even if the server and its files are still available.
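To make that failure mode concrete, here is a toy model of the behavior described above. It is a simplification for illustration only, not how the CSC is actually implemented, and the class and file names are made up.

```python
from dataclasses import dataclass, field


@dataclass
class CachedShare:
    """Toy model of a cache-enabled share under the Client Side Cache."""
    server_files: set
    pinned: set = field(default_factory=set)  # "Always available offline"
    offline: bool = False                     # True once "Work offline" kicks in

    def read(self, name: str) -> str:
        """Return where the file was served from, or raise if unavailable."""
        if not self.offline:
            if name in self.server_files:
                return "server"
            raise FileNotFoundError(name)
        # Offline: only cached copies are visible, even though the server
        # may still be reachable and still holds every file.
        if name in self.pinned:
            return "cache"
        raise FileNotFoundError(f"{name}: share is disconnected")


share = CachedShare(server_files={"report.docx", "budget.xlsx"},
                    pinned={"report.docx"})
print(share.read("budget.xlsx"))  # online, so the server answers
share.offline = True              # e.g. after a single failed ping check
print(share.read("report.docx"))  # the pinned file still opens
```

After the switch to offline, `share.read("budget.xlsx")` raises, even though nothing happened to the file on the server. That is the situation the rest of this post tries to avoid.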

At this point, you and I now understand the situation that needs to be avoided.  We don’t want to have a large number of files under the unnecessary clutches of the CSC, regardless of network quality.

Update 08/17/2018

At first, I thought the solution was to change the file server’s default configuration of allowing users to decide which files are cached.  I changed folders that needed maximum online availability to be set to “No files or programs from the shared folder are available offline.”  This server setting automatically disables the CSC.

Unfortunately, the result was that the folders configured for offline caching worked great, but the folders configured for no offline caching only worked until some network error or server reboot.  In this configuration, once a path becomes disconnected, an Offline Files message is logged in the Event Viewer, and even though no files are being cached, the entire path becomes unavailable.  At that point, the workstation persistently throws Error 0x80070035 any time that particular path is accessed, until the workstation is rebooted.

The only solution I’ve found that works now is to completely disable the Offline Files feature on the workstation.  With Offline Files disabled from the Control Panel, the network and server errors are now transient and I am not having any problems with disconnected paths or persistent errors.

Offline Files is ultimately broken and does not improve the Windows experience.

3 Aug 2018

Category:
Systems Engineering


Windows 2012 Can’t Ping NVR Host

miqrogroove
2018-07-26T23:23:13-05:00

I just resolved a long-term problem where one specific Windows 2012 server was unable to ping one specific device on the same LAN.

There were no relevant resources or similar-looking cases on the web.  Everything else on this LAN worked normally.  The server could ping all other clients, and the clients could ping the server and the NVR.  I just could not get the server to ping the NVR for the life of me.

I suspected at one point that this was a routing issue due to my desire for strong security policies around IoT devices.  This turned out not to be the case, as I could find nothing wrong with the router or any routing tables.

At last, I decided this problem was so specific that it could be a bug in the NVR itself.  In this case, the only thing special about the Windows server from the NVR’s perspective was that the server was providing both DHCP and DNS to the NVR.  I tried disabling each service, and found exactly what I was looking for.

The NVR will not respond to pings from its DNS server.

I don’t know why this is broken and don’t really care to investigate any further.  The workarounds are either:

  • Create a DHCP reservation with its own option to specify a 3rd-party DNS server, OR
  • Disable the NVR’s DHCP client and set a static address with an alternative DNS server address value.

In my case, the NVR does not need to use the local DNS server, so this is an easy fix.  So long as my server’s IP address is not used in the NVR DNS configuration, everything works normally and the server can ping the NVR.
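If you want to repeat the elimination test (change one setting, ping again), a small helper like the one below can save some typing. This is only a sketch: the function names are my own, the address is a placeholder, and the exit code of `ping` is a rough signal rather than proof that the target itself replied.

```python
import platform
import subprocess
from typing import List, Optional


def ping_command(host: str, system: Optional[str] = None) -> List[str]:
    """Build a single-echo ping command for the given platform."""
    system = system or platform.system()
    if system == "Windows":
        # -n 1: one echo request, -w 1000: wait up to 1000 ms for a reply
        return ["ping", "-n", "1", "-w", "1000", host]
    return ["ping", "-c", "1", host]


def responds_to_ping(host: str) -> bool:
    """Return True if the ping command exits with status 0."""
    result = subprocess.run(ping_command(host), capture_output=True, text=True)
    return result.returncode == 0
```

For example, call `responds_to_ping("192.0.2.10")` with the NVR's real address before and after changing its DNS setting, and compare the results.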

26 Jul 2018

Category:
Systems Engineering


High Resource Use by Start Screen

miqrogroove
2018-04-11T09:59:44-05:00

While diagnosing what I thought was a Windows Update failure, I discovered unrelated massive resource consumption and file scanning activity apparently tied to the Start screen in Windows 2012.

Symptoms:

10 to 20% constant CPU usage by Windows Explorer.

Rapid file scanning or Shared Folder usage in the case of folder redirection.

Triggers:

Resource consumption begins immediately after opening the Start screen and performing a keyboard search.

Closing the Start screen does not help.

Workarounds:

Sign out the current user.  This action will shut down Windows Explorer, preventing the unwanted symptoms until triggered again by a user.

11 Apr 2018

Category:
Systems Engineering


Android Studio Setup Error

miqrogroove
2017-06-27T11:29:39-05:00

While installing Android Studio for the first time, I encountered the message below.

The following SDK component was not installed: Google Repository

Simply clicking the “Retry” button allowed the installation to continue successfully.

Relevant log dump quoted below.

java.nio.file.AccessDeniedException: C:\Users\{user}\AppData\Local\Android\Sdk\extras\google\m2repository.backup\com\google\android\support\wearable\1.0.0\wearable-1.0.0-javadoc.jar -> C:\Users\{user}\AppData\Local\Android\Sdk\extras\google\m2repository\com\google\android\support\wearable\1.0.0\wearable-1.0.0-javadoc.jar
Warning: Failed to move original content of C:\Users\{user}\AppData\Local\Android\Sdk\extras\google\m2repository back into place! It should be available at C:\Users\{user}\AppData\Local\Android\Sdk\extras\google\m2repository.backup
java.io.IOException: Failed to move away or delete existing target file: C:\Users\{user}\AppData\Local\Android\Sdk\extras\google\m2repository
Move it away manually and try again.
Warning: An error occurred during installation: Failed to move away or delete existing target file: C:\Users\{user}\AppData\Local\Android\Sdk\extras\google\m2repository
Move it away manually and try again..
Warning: Observed package id 'extras;google;m2repository' in inconsistent location 'C:\Users\{user}\AppData\Local\Android\Sdk\extras\google\m2repository.backup' (Expected 'C:\Users\{user}\AppData\Local\Android\Sdk\extras\google\m2repository')

Preparing "Install Google Repository (revision: 54)".
Found existing prepared package.
"Install Google Repository (revision: 54)" ready.
Finishing "Install Google Repository (revision: 54)"
Installing Google Repository in C:\Users\{user}\AppData\Local\Android\Sdk\extras\google\m2repository
"Install Google Repository (revision: 54)" complete.
"Install Google Repository (revision: 54)" finished.

27 Jun 2017

Category:
Systems Engineering


ownCloud Using Wrong PHP Configuration

miqrogroove
2018-10-20T14:36:36-05:00

The ownCloud community dropped support for Windows Server, so I must resort to documenting such problems here instead of contributing to open source.

One major symptom that confirmed ownCloud was using more than one PHP environment on my server was the presence of session handler files in more than one directory.  Specifically, I was finding orphaned files in C:\WINDOWS\Temp even though my one and only php.ini production file specified a different path as well as garbage collection.

I traced the session file generation as far as the ownCloud calendar “app”, which lives in owncloud\apps\calendar\appinfo\remote.php and related places.

The debugging results were fascinating: not only was the wrong configuration file loaded, but after dumping all phpinfo() output to disk I also found that the calendar app was running under an entirely different version of PHP.

The culprit:  After the most recent PHP upgrade, my site-specific Handler Mappings ended up with mismatched verb restrictions.  Somehow the new version ended up restricted to GET,HEAD,POST by default, while the old version remained unrestricted.  Although my handlers were in the correct order to give all *.php files to the correct module, any time a CalDAV client sent a PROPFIND or similar request, IIS essentially downgraded to the unrestricted version of PHP.

The solution:  Remove verb restrictions for the ownCloud site’s Handler Mappings, and then remove all but one of the PHP Handler Mappings to prevent any other versions from running without throwing errors.
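The fall-through described above is easier to see in a small model. The sketch below assumes that the first mapping in the ordered list whose path and verb both match wins, which is consistent with what I observed. It is not IIS code, and the handler names are hypothetical.

```python
from dataclasses import dataclass
from fnmatch import fnmatch
from typing import List, Optional


@dataclass
class Handler:
    name: str
    path: str   # request path pattern, e.g. "*.php"
    verbs: str  # "*" or a comma-separated list such as "GET,HEAD,POST"

    def accepts(self, url_path: str, verb: str) -> bool:
        """True if both the path pattern and the verb restriction match."""
        verb_ok = self.verbs == "*" or verb.upper() in self.verbs.upper().split(",")
        return verb_ok and fnmatch(url_path, self.path)


def select_handler(handlers: List[Handler], url_path: str, verb: str) -> Optional[str]:
    """Return the name of the first handler that accepts the request."""
    for handler in handlers:
        if handler.accepts(url_path, verb):
            return handler.name
    return None


# The broken state: the new PHP is listed first but restricted by verb.
broken = [
    Handler("php-new", "*.php", "GET,HEAD,POST"),
    Handler("php-old", "*.php", "*"),
]
# The fixed state: one unrestricted mapping only.
fixed = [Handler("php-new", "*.php", "*")]
```

With `broken`, a GET for `remote.php` selects `php-new`, but a PROPFIND falls through to `php-old`. With `fixed`, every verb selects `php-new`, and there is no second version left to fall back to.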

If you get a bogus error about spaces in “the path to the script processor” when updating verb restrictions, just add double quotes around the path, and then click “No” on the ensuing bogus error about needing to create a new FastCGI application.  (facepalm)

19 Dec 2016

Category:
Systems Engineering


Offline Files Access Denied over VPN

miqrogroove
2016-07-01T17:16:27-05:00

I just tried taking a Windows 10 laptop on the road for the first time.  Everything was great until I tried the VPN for the first time.  Suddenly, I was getting Access Denied errors, and “You do not have permissions” errors for all files made available offline.  I confirmed the VPN tunnel and even browsed to other shared folders on the same server.  The offline files errors persisted after dropping the VPN.

When I returned to the domain Wi-Fi, file synchronization completed normally and there were no errors at all.

Am I to believe that Windows 10 is completely incompatible with VPN synchronization?  I never had a problem with this on Windows XP, and I am dreading the months of research and experimentation normally involved in fixing this kind of Microsoft failure.


12 Jun 2016

Category:
Systems Engineering


Offline Files Stay Disconnected Over Wi-Fi

miqrogroove
2016-03-15T14:15:29-05:00

Configure Slow Link Mode

Ever since my move to a new apartment, I was frustrated by some of my network files going offline randomly and staying offline for 5 minutes or up to an hour or two.  The weirdest part was that it would only happen to the network files that were in a path with Offline Files enabled.  As a result, I would periodically lose access to files that were not marked “Always available offline”, and I would get frequent synchronization conflicts for any files that were still available offline.

Another symptom of this problem was that I could map a separate drive letter to the same or deeper path, not enable Offline Files for the networked drive, and then have no trouble with the files when I try to use the drive letter.  I could even browse shared folders using the server’s UNC path at the same time as my Offline Files cache seemed to be stuck offline.

I had several suspicions about why this was happening.  First of all, I had started using Wi-Fi networking on my desktop computer as a convenience until I could knock some holes in the apartment walls to run proper Ethernet cables.  The signal quality seemed good enough that I shouldn’t have persistent connection problems, yet the Offline Files system seemed central to the problem.  I eventually discovered that the Client Side Cache “slow link” mode was at fault for this whole mess.


15 Mar 2016

Category:
Systems Engineering


Shortcode Problems: WordPress 4.4

miqrogroove
2015-08-31T11:07:48-05:00

I will briefly summarize Shortcode API changes since WordPress 4.0 and then kick off some ideas for a roadmap.

The first major accomplishment was the expansion of the API documentation, including a new large section I wrote about the formal syntax for shortcode input.

I also put forward a robust parser concept for the function wptexturize() that promised to re-introduce the ability to use unrestricted HTML code inside of shortcodes and shortcode attributes.  That concept went through many, many changes before being introduced in v4.2.3.  After consulting with the WordPress security team, and after extensive testing of the shortcode parsing functions, we determined that the shortcodes-first parsing strategy was fundamentally flawed and could not be included with any version beyond v4.2.2.  This is why I added an HTML parser to the Shortcode API and ultimately curtailed the use of shortcodes inside HTML rather than expand the use of HTML inside shortcodes.


22 Aug 2015

Category:
Systems Engineering
