Cleaning and validating input data

The HR data source, that I currently receive person data from, has historically had data quality issues. These are much better than they were in the past, but still cause a few issues.

When I attended FIM training at OCG, I raised the issue of data cleanliness and was told in simple terms – make sure the input data is clean! If only life was so simple…..

Back to reality, I have had to add code to my Advanced Flows to deal with, clean up and validate the input data.

A nice example follows – importing Surname from HR – dealing with:

  • Just plain bad data (null as a string/ value)
  • Validation (characters that should not be present – via regex replace)
  • Clean up (removing spaces from around hyphens – double barrelled names).- there is also a bit of trimming to remove and spaces before or after the string value
  • Surname missing!

Things like this remind me of why “Codeless Provisioning” was something I fought to get working (for too long), but ultimately had to abandon in favour of using code for almost everything. Doing so has been a real panacea for all of the rules and other funnies that I have had to accommodate.

Note: I made a little edit – I was not checking for the presence of AccountName before raising errors – should that attribute have been missing (highly unlikely, but not unknown to occur), that would have raised an error in itself. The edited code is a little more robust!