Pragmatic 7: A Category Called General

5 January, 2014


John and Ben discuss the challenge of digital storage, filing, tagging and categorization as well as associated workflows and technology.

Transcript available
This is Pragmatic, a weekly discussion show contemplating the practical application of technology. Exploring the real-world trade-offs, we look at how great ideas are transformed into products and services that can change our lives. Nothing is as simple as it seems. I'm Ben Alexander and my co-host is John Chidgey. How are you doing, John? I am doing very well, thank you. And how are you, Ben? It's a beautiful 0° morning. Well, it has been a stinking hot, horrible day here, I have to unfortunately report. So yeah, opposite ends of the spectrum, on opposite sides of the world. [unclear] I wish that was the case, but anyway. All right, we have a lot of thank-yous this episode, so I'll break them down quickly into two pieces. First of all, all the people on Twitter that have given shout-outs to us about enjoying the show, which I really appreciate. So thanks to solid mail (I can't actually find out what solid mail's name is, it's a company account, so thanks to them), Daniel gnomes, Andrew Escobar, Matt bulk, Brandon Gribben and, I'm going to struggle with this one, apologies in advance, a Polish name, Rattus la, my goodness, appeared.
Reserve ski BDO. Apologies if I mispronounced anyone's name, but thank you very much for that, it's very much appreciated. Just as importantly, however, some people have written very nice articles about the show, blog posts saying different things they like about it, which again I really wholeheartedly appreciate. So thanks to Scots will see, who included the show in his list of suggested podcasts to listen to, and we're in some pretty good company there, so there will be a link in the show notes for that. Thanks also to [unclear], with a rather nice piece about the show on their site; again, there's a link in the show notes. But also a very special thank you to Marco Arment, who said some wonderful things about the show on his site; again, there is a link in the show notes. I've been following Marco's work for the better part of five years now, from when Instapaper first came out in mid 2008, through Build and Analyze, the podcast on 5by5, his work with The Magazine and, most recently, the Accidental Tech Podcast, and soon to be released, the much anticipated Overcast, the podcast app he's working on. So for me it's a very different feeling to have someone that you've followed and respected for so many years appreciate what you do. So again, very special thanks to Marco for listening to and enjoying the show. Thank you, Marco. He's the reason I'm drinking strong, disgustingly bitter coffee this morning. [unclear] Okay, fair enough. If I drank coffee I could relate to that, but unfortunately I'm drinking water. So anyhow, cool, right. The topic today that I'd like to go through is something that we touched on in previous shows and said we'd get back to, which is what we store digitally, specifically digitally: where, how and why. I can't recall the exact details of why this came up,
but I want to break this down into two pieces. The first piece is filing of information. Filing, I know, how exciting. Oddly, it is one of those things I find fascinating, because it's so hard to get it right. What we keep, and on what, will be the second piece. So let's dive right into the first piece, which is filing. When you're filing anything of any kind, there's more than one way to skin a cat. You can go with an alphabetical system, or a numerical system, which is generally arbitrary. You can go with a chronological system, by the date that something occurs. In some cases you might organise by geographical location, so this information relates to this geographical location. In engineering parlance, maybe this particular site is up in the mountains in this location, so everything for it gets filed together. In a more personal context, the bunch of videos I took at Six Flags Magic Mountain might all go in one folder or subfolder, whatever. So geographical, potentially. And obviously there's also subject: Ted's birthday, or a graduation day, whatever. You might have photos, videos and, say, documents for building a new house. Those are subject-based filing systems. However, the question with filing systems of any kind is whether they're worth it in a day and age where we can search everything so easily with indexing systems and powerful computers. We're at a point where you can search the entire contents of your hard drive in not a lot of time, really. When you've got search power like Google to search the Internet, and you apply that to a desktop scenario, Spotlight in OS X for example, it's got to the point where it's almost become quicker to search than it is to file things away by
any system, really. The funny thing is there is a study done by IBM. The IBM research paper is linked in the show notes. It's a bit dry, but it's interesting. What they did was a study of email re-finding. They had 364 or so people that took part in the study, and the concept was they wanted to see whether or not there was a measurable time saving to be had by filing the things in your inbox into folders, subfolders, categories, however you choose to file (the system is not important exactly, but some kind of filing organisation), versus someone who just dumped it all in one folder and searched. They found that, once the emails were filed, it was only marginally faster to find them that way, and when you included the amount of time that it took to file and organise in the first place, search was the clear winner. When I read this paper a few years ago, I immediately threw my hands up in the air, because there's a part of me that has been trained through years and years of habit: I put this in this folder. My email inbox is broken down into the different projects I've worked on. On my hard drive, my photos are in a directory called Photos, videos are in Videos, and I break down videos by year and so on. But why do I do that if I've got something fully text-searchable like email? Search is now so much quicker that when I'm filing I wonder what the point is, because they've shown that it doesn't actually reduce the amount of time, at least when it comes to email. The other concept that is interesting with filing is the method by which you break things down, and the problem is every method has its drawbacks. If you go alphabetical, that assumes you have knowledge of the name of what you're looking for. So the problem with alphabetical: say you're looking through photos, and each of the photos has a filename;
a filename is a description of what that photo is, and it could be "Ted's birthday party" or "birthday party at Ted's". Well, if you don't know what letter it starts with, "birthday party at Ted's" sorts much higher than "Ted's birthday party", which is going to be closer to the bottom of the alphabet. So alphabetical is arbitrary insofar as you have to know what letter the expression you're searching for begins with, or the method by which it has been organised. The problem with a numerical system, obviously, is that it depends on your recollection of the order in which things happened, because in a numerical system you typically number things in sequence. For example, at work, quotes may come in in a certain order, so they're quote 21, 22, 23, and you have to remember what each number relates to. And sometimes it's not chronological: a tender will come in after work has started on another one, and the one that technically came in afterwards gets assigned a number first because it was quicker and easier to do, let's say. So with numerical systems you've got to be careful what sort of methodology is used for dishing out the numbers, because it can end up being quite arbitrary. Geographical is all well and good, but it depends on how far down in detail you want to go. Say it's geographically broken down so that I've got this site, this house, a certain size yard. Okay, that's great, but if I want to look at all the photos that were taken in this specific room, well, geographically I didn't go down to that level of detail, so too bad; I still have to go through and search manually. As for chronological, again, you have to remember when things happened, and we human beings have terrible memory with respect to time. Our brains compress time and we perceive it in different ways.
So I was talking with my wife recently about a medical test that I had done a few years ago. I said it was about five or six years ago, and she pulled out some of the forms and the x-rays, and the date was two years ago. I was convinced it was five or six years ago, and I went, okay, that was a busy two years. A lot happens and you tend to think it's been a lot longer than it really has been. So there's a problem with perception of time, and chronological systems aren't perfect either, although there are better options. Honestly, the best way is to break things down by subject, and a lot of people do that now when they file something: this has to do with invoicing, this has to do with accounts receivable, this has to do with whatever. But the problem is, no matter how you slice it, eventually you will end up with a category called General or Miscellaneous, and that is maddening. I remember when I went to work for a company, and when I got there they said the jobs are all in this directory, and if you look for the information it'll be under Miscellaneous. And I'm like, okay, that's a great filing system. It got to me because when I worked in Canada I worked with someone who was very, very specific about her filing system, and you did not mess with it. You put the wrong thing in the wrong slot, she got very upset. Document control people, man, those are people you don't cross. Anyway, on that, I recall an xkcd comic. I don't often link to them, but they're really hilarious, and this one is about the whole miscellaneous category thing, so there'll be a link to the xkcd comic. I'm not going to try to describe it, because this is an audio show and it's a visual joke. [unclear] That's pretty much what it looks like here.
So yeah, I agree that inevitably, no matter what system you choose, there will be a limitation. The best filing systems will have multiple methods by which you can find the information you're looking for. In the bad old days, where everything was manual, a piece of paper in a physical folder in a filing cabinet, rows of filing cabinets in a room, obviously that was more difficult. But now, with everything being searchable, do you really need to file it that way, especially when you've got multiple attributes associated with a file? Anyway, we'll get to that in a minute. One of the interesting things I found when I was doing some of my research, looking for links specifically about the IBM research paper, was "How Mac experts organize their files", an article by Lex Friedman on Macworld. I occasionally read Macworld, and this particular article has interviews with four people: Casey Liss, Federico Viticci, John Siracusa and Katie Floyd. All of them had different ways in which they organised their stuff, essentially their own filing systems, and they were quoting all sorts of helper applications like Quicksilver, LaunchBar, Alfred and [unclear], and one was even using Hazel, which is a file renaming and filing utility. I use that, actually. [unclear] For the different shows, I drop things in a folder and things happen automatically. That's nice, if that works for you. See, I look at all those things and think, well, the organisation of individual files, and the tools to find them more easily, is a very personal thing to solve. But I guess where I'm going with this discussion is
to still think of it from the point of view of: is the effort you put into that workflow saving you time? If you need to go to advanced tools like Hazel, for example, that's more in the power user category, whereas I was hoping to look at it more from the average user's point of view, and whether or not you should be bothering. So we'll see how it stacks up [unclear], so bear with me, hang in there. Okay. The great thing about digital storage is that, quite frankly, you have a time and date of accessed, created and modified. You have those attributes in most file systems, as well as obviously the file name and the file type. That's information you can index, search and cross-reference by, so there's quite a bit there. A lot of people will take some of those attributes and replicate them, and say, okay, I'm going to embed the file creation date, like a chronological system, at the beginning of the filename. What they'll then say is, okay, now I've got it in the filename, it's easy to search, and maybe it'll sort more easily in a large directory listing. That's all well and good, but I suppose the question is: isn't that duplication, when the time and date accessed, created and modified already give you that? Right. And I do realise that when you clone a hard drive, there are some operations, restores, that can mess those dates up, and I guess that is an issue, and by putting the date in the filename you guard against that. But in any case, if it was grouped by subject, which is more natural to search by, say all the audio clips from the show, from Pragmatic, do you really need to know the date that they were done? You just need to know the episode numbers, right? I need the episode numbers and the timestamps, sure, but the date doesn't really matter;
I just need them in chronological order relative to each other. Exactly. And it's that sort of thinking that a lot of people have. I've come across people who use Hazel for that. Federico's an example, and I'm aware of one other person, so of all the people I know that use Hazel, that's two. Now you use it, so that's three. There you go. Statistically questionable; you could say it's anecdotal. Anyhow, there's an in-joke. But the point is that the information is embedded in the file system information for that file, so you don't have to duplicate it if you don't want to. That's one of the wonderful things about digital that is so much harder when you've got an actual printed document. One of the things I find interesting about file types is that they are kind of a new level of granularity, I guess is the right word. When you pick up a book, you pick it up with two hands, you open it, hopefully you open your eyes, and you start reading. You say, okay, I get that this is text; I know how to read English, French, Spanish, German, whatever language it might be written in, and hopefully you know how to read it. Then you'll see, okay, there's a picture, there's a graph, a chart, whatever. There is nothing on the cover of the book that tells you explicitly that this book has a chart, or this book has whatever. It's just a book; you pick it up and open it to interrogate what's in it. Is "interrogate" the right word? Yes, it is. It does conjure up an image of someone sitting across the desk from a book, yelling at the book, interrogating it. "We will break you!" Okay, if the book starts talking to you then I'd suggest you should adjust your medication. But anyway. This is an audiobook,
right? Okay, let me get back on track. However, with a digital file, it will have a type that is associated with it. So it's a Microsoft Word document, unfortunately, or a Pages document [unclear], some might say also unfortunately. Irrespective, you might have a PNG file, which is obviously an image file format, or a GIF file, however you pronounce it, because that causes trouble, or a JPEG file. Any of those, okay, they're an image, and you know that before you open it. I suppose a book is more analogous to a Word document, because a Word document could be a mixture: it could contain video and images as well as text. But then again, at the same time, the difficulty in the digital landscape is that there are some file formats that will not open without certain software. But the point is more that the type tells you a bit more about what is actually in the file than the cover of an actual book would. So again, what I'm getting at is the difference between the physical world and the digital world. In the digital world you don't have to deal with the issues of a book. To understand what's in a book you've got to open the book. Sometimes there'll be a synopsis on the front, or an index, or a table of contents; you open the book, leaf through it and see, and even then that might not be a good indication of what you're looking for. So realistically the file type is critical and very, very useful. A filing system based on file types is quite appealing, and it's something that's not as easily possible with physical pages. Okay, I'll quickly talk then about cataloguing systems. You may think it's a bit odd to talk about, but here is the reason. For example, on a Mac I've got Spotlight built in, so I can search for something, and that's great for any volume that's currently connected to the computer. There's a Spotlight index on the drive, it searches the Spotlight index, and it's all hunky-dory and relatively fast, hopefully anyway. However, if you have a drive that's not connected, Spotlight won't find what's on it. A cataloguing system maintains an index, but the drive doesn't have to be connected to the system. For example, let's say you back up a bunch of information onto an external hard drive, or onto a bunch of CDs, DVDs or Blu-ray discs. If you have cataloguing software, you can use it to keep a dedicated index or database of all the files you've got backed up somewhere. The reason I bring it up is that when you are looking at how you organise things, you can just take all the files you want to keep, photos, videos, whatever, dump them into a single directory and then burn that across a bunch of Blu-ray discs if you wanted, or back it up to a single external hard drive and put it on a shelf. As long as you catalogue that, and the index lives on your computer, you don't need Spotlight to search for it, because you've got software with the index in it. So you search the catalogue and it says, okay, what you're looking for is on this drive. You may ask why you'd want to do that. I guess the point in the case of a hard drive is that you're not racking up running time: once it's in the catalogue you only retrieve it from the shelf when you need it, so it's sitting cold, not running all the time, because
while it's running and spinning, a drive is wearing out its mechanisms [unclear]. And when you've got CDs or other optical discs, obviously when they're not in a drive you can't read them. So cataloguing is an extension of the search capabilities for your data. The thing I find a lot of people do with their filing systems is they spend a lot of time building up this directory structure, and then they'll put it on a hard drive, or on a bunch of discs, and they have no idea which one it's on. Yeah, I mean, I'm looking at my portable hard drives right now, thinking, what's on what, and how do you find it all? Okay. So, I do offline backups to Blu-ray disc, and I've got a lot of flak for that one, and I'll explain my reasons shortly. But irrespective of whether it's a cold hard drive that's simply not connected and powered on, or a bunch of flash drives, or a bunch of optical media, irrespective of the actual physical device you're storing your information on as a backup, if you are backing up or cloning or whatever you're doing, you need to know what is on what device, whether that device is a hard drive or optical media. Say you have 10 external hard drives sitting there cold with backups of the last six or seven years of data on them. How do you know which one has got what? If you don't catalogue it, you have to go through and search manually, and that's insane. [unclear] It's something I used to do. I used to go through 100 or thereabouts CDs, DVDs and a few Blu-ray discs in the early days until I found what I needed, and it was crazy. Then I got into a cataloguing system, and
on the Mac these days there are multiple examples out there, and on Windows, if you're a Windows user, one I played with briefly was [unclear], a freeware one I think, from memory. There are quite a few different ones out there. These are not necessarily recommendations; do your own research, of course. But there's not much to them: you pop the drive in, point the software at the drive or CD and say go, catalogue [unclear]. The better ones will even give you a snapshot. For example, if you've got a bunch of images, it will take a very low resolution snapshot of each image, like 4 kB or whatever, so you can look at and search for images and have a flip through them. All you have to do is give the disc a name that you cross-reference. I've done the simple thing and simply had an arbitrary, I guess you could call it numerical, filing system for the actual Blu-ray discs, except instead of numbers it goes A to Z and then AA and onwards, so that it all fits into a single 120-slot CD case, essentially, that I've had for the last 13 years. [unclear] general backup.
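The cataloguing idea described here comes down to two steps: walk a volume once while it is connected, storing each file's path, size and modified date against the name written on the disc or drive, then search that index later while the media sits on the shelf. What follows is a minimal sketch in Python, not how any particular cataloguing product works; the function names `catalogue_volume` and `find`, the table layout and the disc labels are all assumptions made for illustration.

```python
import os
import sqlite3


def catalogue_volume(db, label, root):
    """Index every file under root, tagged with the disc or drive label.

    label is whatever name is written on the media (hypothetical, e.g. "AB").
    Returns the number of files recorded.
    """
    db.execute(
        "CREATE TABLE IF NOT EXISTS files ("
        "label TEXT, path TEXT, size INTEGER, modified REAL)"
    )
    # Cataloguing the same label again replaces its old entries.
    db.execute("DELETE FROM files WHERE label = ?", (label,))
    count = 0
    for folder, _dirs, names in os.walk(root):
        for name in names:
            full = os.path.join(folder, name)
            info = os.lstat(full)  # lstat: do not follow broken symlinks
            db.execute(
                "INSERT INTO files VALUES (?, ?, ?, ?)",
                (label, os.path.relpath(full, root), info.st_size, info.st_mtime),
            )
            count += 1
    db.commit()
    return count


def find(db, text):
    """Return (label, path) pairs whose path contains text.

    Works with the media disconnected, because only the index is read.
    """
    rows = db.execute(
        "SELECT label, path FROM files WHERE path LIKE ? ORDER BY label, path",
        ("%" + text + "%",),
    )
    return rows.fetchall()
```

With an index like this kept in a database file on the computer, something like `find(db, "birthday")` says which labelled disc to pull off the shelf. Note that `%` and `_` in the search text behave as SQL wildcards in this sketch.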
But anyway, I'm dwelling too much on this. The suggestion is that a cataloguing system is very helpful. And don't bury things in an ISO. A lot of geeks will say, well, I'll just do a SuperDuper clone or a Carbon Copy Cloner clone of my hard drive. That's fine, but if you do that, that's for restoring purposes; that's not actually backing up your data. And this is the other thing: it's not very convenient. I've tried it and it's terrible, because all that happens is you end up with this 80 gig or 150 gig image file, depending on the size of your computer, and then you'll end up with two or three of them, and you're like, okay, I was looking for this file and I don't know which one it's on. And let's say I've updated or reinstalled the operating system in the meantime; with Time Machine backups the continuity could potentially be broken there, and that's assuming you're running a Time Machine backup, mind you. If you're doing Carbon Copy Cloner or SuperDuper cloning then obviously I would expect you'd also have Time Machine backups. But I guess my point is that if you bury files in an ISO and then back up the ISO, you've got no visibility of what's in it. So that's just something to be aware of. So I guess that's the filing systems side, which is what I wanted to touch on in the first part of this discussion. The next bit is exactly what we keep where. Obviously I've sprinkled into this discussion some talk about business and professional stuff, and that is going to be highly variable based on what you do. For example, in my line of work there's a lot of drawings, a lot of contract documents, a lot of different documents of different kinds, and obviously that's going to have some unique requirements that are specific to what I do, like quote numbers or even job numbers, whatever. That's not what I want; I want it more balanced towards
personal stuff, because that's what's going to be most useful to people. One of the things I'm trying to do with this show is have people walk away from it with some ideas they could try, that they might find useful and do something with. And this is one of those areas where I think you need to think about it a bit more. When you're working with other people, working in a company or on a big project, the searching versus filing argument goes out the window a little bit [unclear], and the same thing applies when you're dealing with family stuff, which we weren't going to get into. But with your own files [unclear], if you try to apply the same sort of thinking and consideration that you would in a corporate setting, inside the enterprise, you're going to have a hard time with your own stuff. Absolutely. The rules that are set out in larger corporations are, in my experience, very rigid. You've got to do what they say: they say put it on this drive, in this file structure, and that's it. I've worked at companies with a whole department called document control, and if you set a foot wrong you will know. They will find you, and they will do things to you that you will not forget. [unclear] We had to be really, exceedingly strict with people about [unclear] at the time. You had some pretty wacky filing systems and naming structures, but putting something in the wrong place could mean a many-thousand-dollar mistake, and that's where document control is vital. It's like the brakes on the car. Absolutely. So I guess if anything, the message I want to get across from the first half of this discussion is that filing is overrated. I'm
sorry, but it just is, for digital information. And when I say that, I mean for most digital information. You have a standard file name and a file creation date, and especially now with photos, a lot of them are being geotagged because people are taking them on smartphones with GPS built in. Even if the tag is only within 100 or 200 metres, sorry, 300 or 400 feet, of your location, it's good enough to know you were at Disneyland, right? The geographical information is automatically built in for you. You don't need to add it, you don't need to file it, it's just there. So what are you filing for? There's lots of software out there that can read that information and organise things for you. I'm not going to extol the virtues of iPhoto specifically, because it's become somewhat of a lumbering dinosaur, or even Aperture, which also seems to be a bit of a neglected stepson or something. But never mind; the point is that there is software out there that does a good job with photos. Irrespective of that, the point is: modifying filenames, putting emails into subfolders, grouping things together, and tagging things, since that's the latest thing in Mavericks, honestly, is it really worth the effort? Based on everything that I've seen, everything I've experienced, and the study that IBM did, I have to think that search is simply quicker and easier. And if you have a cataloguing system for your offline stored data, then you've got both; you've got everything you could need. That's sort of how I see it anyway. However, that doesn't address the issue of what on earth you do with all this data, because the problem is we're accumulating so much data all the time now. We're taking so much video, and it's not just low-res video, it's 1080p. The iPhone's been taking 1080p video now for a couple of years, I think, and before that it was 720p. These file sizes are not insignificant. The hard drive on
my, sorry, the SSD on my MacBook Air is 256 gig, and it's got a combination of music and movies and TV shows and home videos and photos. There's so much stuff on there that is just personal stuff, the memories of the kids, the places we've been on holidays and so on. The vast majority, in terms of size, of what you will find on any individual's computer these days is not documents. Yeah, you'll have some official documents, correspondence with the bank for some reason, or maybe you've got some receipts, because sending soft copies of receipts has become quite popular these days, and even bank statements up to a point; you can download bank statements and save them in soft copy, kept under lock and key, and maybe your family budget. But all of that takes up hardly any space at all when you think about it, really only a few hundred kilobytes each on average for those files, probably, if they're PDFs, whereas for one photo at full res on an iPhone, with a mixed background and JPEG compression, you're looking at five meg. So that's where the actual problem is in terms of storage. And the thing is, there's the cliché of people whose house is on fire: you run out of the house, the fire department is there, and they're asking why you're running back in. You're not going back to get your bank statements; you're going to get your photo albums. We're feeling this really acutely right now with a baby who is just four months old; the amount of video and photos we've produced in the past four months is insane. [unclear] Behaviour changes so much [unclear], and people are concerned about their iPhoto libraries [unclear]. Absolutely.
And what I'll do is walk through the different possible solutions. Now, before I even begin, I'm not here to talk about Time Machine versus Carbon Copy Cloner or SuperDuper versus an off-site storage system like Backblaze or CrashPlan. I'm not talking about any of that. I'm talking about the media that you would choose locally to back things up on, and I'll tell you why. First of all, I think that discussion has been well and truly covered previously on other podcasts, and the ultimate solution seems to be to pay money to someone for off-site backups. From a professional point of view that's exactly what I recommend my clients do, off-site backup, because you've got geographical separation: if there's an incident at one location, it's unlikely the other location will be physically affected. It makes sense to have more than one physical location for super-critical information, and that's pretty standard stuff. But what about people that don't have that? Let's assume, for instance, that they're not geeks, or that they haven't got the bandwidth, like me; I'm stuck [unclear] out in the sticks, quote unquote. Not really, but unfortunately from a telecoms perspective it's the sticks. So not everyone can use an off-site storage system with an automated backup. It's just not available, it can be a very long time until it is available, and even so it has an ongoing cost. A lot of people would say, well, I have lots of valuable data, and there are actually ways of handling that without doing the off-site storage. Some of them are more or less safe than others. [unclear] Frankly, most people just do it locally, and that's who this discussion is aimed at: people that are, quote, "normal" [unclear]. I would like to think that I'm not the
only person out here who is a geek that doesn't have off-site backup, but maybe I'm wrong. I guess we'll see.
So if you're going to store it locally, your options are essentially three broad types of technology: you have magnetic storage, you have optical storage and you have flash storage. We'll talk a little bit about the pros and cons of each. Magnetic broadly falls into two categories: what I would call flexible media, and non-flexible, rigid media. Rigid media is a hard drive; flexible media is any kind of tape or floppy disk.
It's funny, I showed my kids a floppy disk the other day. I've still got a 5 1/4 inch disk in the house, and my son piped up and said, "I've seen a picture of that on some software we use at school, it's a toolbar button." It's funny: we've got these kitschy representations in icons and no one has any idea what they are any more. It's terribly sad. We're showing our age.
But anyway, irrespective of that, the point is that flexible magnetic media has all sorts of problems. The media itself can be physically damaged: if too much tension is applied it can stretch. It doesn't like humidity and it doesn't like temperature. It can delaminate: the magnetic particles can actually partially delaminate from the surface of the plastic of the tape. It's just dodgy. Then there's the obvious one, which is that you pass a strong magnet nearby and it wipes all the data, flips all the bits, if it's close enough and strong enough. The other one, of course, is that eventually magnetic tapes will depolarise and lose their data naturally. With most magnetic media that's flexible, you're looking at about 10 to 20 years for
their lifespan. There's a link to a good Wikipedia entry in the show notes, so please feel free to read that. It goes into a lot more depth than I am; I'm sort of skimming over the key points.
Having said 10 to 20 years, you can get professional grade backup tapes that are guaranteed for 35 years. It's kind of obvious that such a market exists, because you've got people with massive servers who want to do tape backups, and people pay lots of money to do all the backups, change the tapes and put them in fireproof safes in different locations physically outside the building, because they don't want that sensitive information being transmitted over the Internet to CrashPlan or Backblaze or whatever, where it could potentially be intercepted and decrypted. So good old-fashioned tape and physical media has a big role in that application. But the average person is not going to go and buy a tape backup system.
And if you look at floppy disks: the final floppy was what, 1.44 meg? Or 2.88, whatever. For the 5 1/4 inch ones, 1.2 meg was the high density and 720K was the lower one. It was terrible. Floppy disks are dead, so throw them out the window. You've really only got those tapes, and it's not necessarily a good idea for the average person to go and buy thousands of dollars' worth of professional grade backup tapes just to last 35 years for backups of photos.
So a lot of people these days are looking to hard drives, because hard drives have come down a lot in price, and they're saying, well, hard drives are hundreds of gig and they're cheap. But hard drives are magnetic media as well. In that respect a hard drive is no different to a backup tape, except of course that the spinning platter itself, which stores the actual information, is protected from the
outside world. It exists in an isolated atmosphere, sealed away from dust and so on, so it's going to be far more reliable, right? You would think. The problem with hard drives is the failure mechanisms. For example, if it's turned off and sitting on a shelf, you may think, well, it's not spinning, so it's not going to have any of those wear mechanisms. "Wear mechanisms" is reliability lingo. It just means that if a drive has been spinning for 100 million rotations and its drive bearing is only rated to half that, then statistically, the further you go beyond that rating, the more likely a failure becomes. So you're safer not spinning. I plug a hard drive in, do my backups to it, disconnect it and sit it on the shelf. It's no longer rotating, so that wear mechanism is no longer in play, and that's true.
However, there are other wear mechanisms that are still in play. These things have got power supplies, point-of-use power supplies, voltage regulator chips and different capacitors in there for voltage regulation and smoothing and that sort of stuff, and those individual devices have all got a limited lifespan. You can't just leave it on the shelf: eventually a lot of capacitors will dry out, and thermal expansion and contraction will still happen, unless of course the drive is being stored in a temperature-controlled environment, and most people don't have that in their house. So it will be sitting on a bookshelf, or maybe, if you're smart, in a fireproof safe. The temperature is still going to fluctuate, and you still have differential expansion rates between the components on the circuit board, the circuit board itself and even the casing of the hard drive. With all those things, eventually you'll get issues: you'll have bad connections and eventually it will just die. You can't
avoid that. No technology lasts forever, it just doesn't.
Now, those are the wear mechanisms, but let's just ignore them for the moment. Let's assume that we leave it on the shelf and everything is hunky-dory: how long does the actual information last? When I was doing some research into this a while ago, I came across an article on, of all things, [site name unclear], and it was quoting someone who was working on the physics behind hard drives. They were aiming for basically what he called a 10 year bit life, which is to say that after 10 years the integrity of a single bit cannot be guaranteed, because statistically a cosmic ray might hit, and obviously if a cosmic ray hits, it will flip that bit, amongst a few others in the local vicinity most likely. And eventually, inevitably, the magnetic domains will start to break down and you will start to get data corruption. So that's what they're aiming for. Even without wear mechanisms, let's assume they achieve that. Maybe they do better than that and make 15 years before we start getting significant data corruption. And obviously if you've done multiple backups to multiple hard drives, then statistically you may be more covered. Irrespective, the point is that you're not going to get huge amounts of guaranteed life out of these things even if they're sitting on a shelf.
Relative to other magnetic media, the idea of using hard drives for long-term backup storage is actually relatively new, because previously hard drives were a lot more expensive and tape was cheaper for the same amount of storage. If you wanted to store 200 TB, it was a lot cheaper to get magnetic tapes, lots of them, than it was to do it with a bunch of hard drives. That's changed in the last five, seven, eight years or so, so relatively recently.
So that's magnetic. Optical, which is the one I'm
using, is not perfect. The thing to appreciate about optical media, for those that don't realise, is that there are two ways you can make it. You can press it physically: in other words, the mountains and the flats of the ones and zeros on your optical disc can be pressed into it physically, in which case it's a far more solid, reliable, long-term proposition. There's a glass master, and the discs are literally pressed from it. The plastic will be slightly warm, you press it with a certain amount of force, allow it to cool for a few seconds, actually fractions of a second I would think, and then you pop your optical media off and off you go, you're done. The point is that that's far more reliable than if you burn a disc yourself.
For doing backups you're going to burn a disc. You're not creating a master and pressing 100,000 copies of the thing; you're going to burn it once. The burned ones are a completely different kind of technology. That requires a high-powered laser to literally burn a hole in a layer of dye. There's more than one way of doing it, I'll point out, but irrespective, the strength of the laser and the age of the drive affect how effective it is at actually doing that, and certain drives tend to have more issues with alignment than others. That's an example of why, when you burn a DVD or Blu-ray disc on one drive, you can sometimes have problems reading it on other drives. So it's not exactly perfect. The other problem, of course, is that exposure to intense light and heat can cause a breakdown of that dye layer that you write information to optically. So for most optical media out there, your average stuff off the shelf, you're looking at 10 to 25 years. Some of them will advertise as much as a hundred years, but whether or not you trust
that or not is debatable. If you're really paranoid about backing up data on physical media, I wouldn't trust that.
But there's something interesting that they're working on, [name unclear], and the concept is to have a 1000 year lifespan for optical media. It's a 25 gig single layer Blu-ray, I believe, and it was released about six months ago. It's rather pricey; I haven't got the price in front of me, but obviously you get what you pay for. The whole idea is that it's supposed to be tested and proven to last a thousand years or more. The funny thing, though, is that obviously that's under controlled conditions: controlled temperature, being away from sunlight and so on. The problem is, how do you qualify this? How do you actually prove this? Nobody has been around for a thousand years to check.
The answer is accelerated life testing, which is what they call it when you try to predict a rating. The way they do that is they run it through thermal cycling. For an optical disc I'm not entirely sure what else, but we did ALT on electronics all the time. As part of your robustness testing for reliability, you stress the living crap out of a card. When they're out in the real world they go through 365 days a year of thermal cycling, a shift of say 50 or 60°F in an average day, max to min and back again, day in, day out, for extreme locations. If you replicate that in a lab and you do one of those temperature cycles in the space of 10 minutes, ramped, then you can estimate a rating.
[unclear] We'll jump off any kind of platform well before a thousand years.
Absolutely right. So the obsolescence point there is, yeah, and
I mean, I'm assuming, I guess, there would be some way to get it back, but it is dubious. It would be up to the archaeologists digging up the remnants of civilisation.
Exactly right. And even if they did dig it up, would they have working, functioning equipment that was capable of reading it? And then there are the file formats.
Yeah, exactly, emulating everything. What good is an optical disc if you don't have a working computer? Because even if you can spin it up and get the data off, how the hell do you know the file system? There are so many holes in it. And I'm not going to get into a Siracusa discussion about HFS Plus; he's done that, and I think that's well and truly done.
So, 35 years for the professional grade backup tapes: you could even argue that's a stretch, because the companies selling them would need to continue to support them with current drivers for current operating systems. What if they go out of business? So the whole concept of having a really long lifespan on these things is somewhat misleading.
I went through an exercise recently, in the last 12 months, where I took all of my CDs and all of my 3 1/2 inch floppy disks, I had some 300 floppy disks, and I went through and imaged all of them. I had a USB floppy drive and I imaged all of them, and I compacted them down onto a bunch of 25 gig Blu-ray discs. So I have everything that I've ever had over the years, some of it getting close to 20 years old, all on a bunch of Blu-ray discs, just in case I need it. When will I, realistically? It's mostly software for an operating system that doesn't exist anymore. But I guess maybe one day I'll want to mess around, get nostalgic, and emulate and have a fiddle with Windows 95 again. I don't know. I did a Windows 98 VMware Fusion virtual machine but I couldn't get the audio to work properly.
Can you run it solely in JavaScript?
Yeah, but no.
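
The accelerated life testing arithmetic from the thousand-year disc discussion above can be sketched in a few lines of Python. The one-cycle-per-day field rate and the ten-minute lab cycle are the figures quoted in the episode; treating thermal cycling as the only stress, and replaying it cycle for cycle, is a simplifying assumption and not a description of how any vendor actually tests.

```python
# Back-of-envelope accelerated life test (ALT) duration.
# From the discussion: one thermal cycle per day in the field, and one
# cycle compressed into about 10 minutes in the lab. Everything else is
# an illustrative assumption, not any vendor's real test procedure.

CYCLES_PER_YEAR = 365              # one max-to-min-and-back cycle per day
LAB_MINUTES_PER_CYCLE = 10         # compressed cycle time quoted in the episode
MINUTES_PER_YEAR = 365 * 24 * 60


def lab_years_needed(claimed_years: float) -> float:
    """Years of continuous lab cycling needed to replay `claimed_years` of daily cycles."""
    total_cycles = claimed_years * CYCLES_PER_YEAR
    return total_cycles * LAB_MINUTES_PER_CYCLE / MINUTES_PER_YEAR


def acceleration_factor() -> float:
    """How many field days a single lab day stands in for."""
    return (24 * 60) / LAB_MINUTES_PER_CYCLE


for claim in (35, 100, 1000):
    print(f"{claim:>5} years claimed -> {lab_years_needed(claim):.2f} years in the lab")
```

Under these assumptions, replaying a thousand years of daily cycles would still take close to seven years of continuous testing, so a thousand-year figure on a recently released disc would have to rest on extrapolation from harsher stress rather than on a cycle-for-cycle replay.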
Anyway, okay, bygones, because we're almost at the end. What I've done is I found a great website. It's about CDs, okay, but it's in the show notes, [site name unclear], and they've got a nice little discussion of optical discs and life expectancy, longevity and so on. I think it's really good. It's not super technical; it's quite easy to read and understand, and it doesn't need too much prior knowledge. It's a nice one on the subject if you want to read more about it.
The last one, which occurred to me as I was preparing the notes for this episode, is flash. The reason I thought of flash is that flash is now where hard drives were a decade ago: it's reaching the point where you can use a flash drive as a backup medium for small quantities of data. You can't justify it if you've got a terabyte to back up; a terabyte of SSD is going to cost you a hell of a lot of money. But you could do it. And if it sat on the shelf it's going to have a completely different set of wear mechanisms, because you don't have to worry about things drying out and the like. It's a silicon-based solution. So the USB.
With a compliant USB port, [unclear], no problem. Do we assume USB will be around for another 20 years? Of course. Well, maybe not, but I'm sure adapters will exist.
The problem with flash, though, is that I really struggled to find very much information on the actual estimated storage lifespan if it's sitting on a shelf. There's lots and lots of information about the number of rewrites, whether it's SLC, MLC or eMLC, all the different technologies, and I do want to have a show about SSDs, and not all SSDs being created equal, at some point in the future. But for the moment at least I just want to mention the fact that it's all about read/write cycles. The majority of people, I think, go: I'm using an SSD as a replacement hard drive on my laptop, on my desktop, whatever. But there are also removable thumb drives. It does amaze me, all the names for those: thumb drives, memory sticks, flash drives, USB drives, jump drives. I don't get "jump drive". What do you do, jump on it? Anyway, whatever. [unclear]
So I guess the point is that those are generally slower speed, although some, like the Patriot ones, are meant to be high-speed USB 3. What you have, essentially, is a system that you can put on a shelf: it's got no moving parts, it has minimal non-silicon components, and the wear mechanisms should be relatively benign. So what have you got? Potentially the perfect candidate. One thing I did find when I was looking into this is that, essentially, the charge in each of the memory cells will leak out over time. It's just a matter of how long, and that's what I couldn't find out. If anyone has that information I'd be very curious. At the moment it's still a cost-prohibitive thing. The suggestion is, of course, that you keep it cool, because high temperatures accelerate the leakage rate. So obviously a flash drive in
a cool location is going to give you a much better result than a hot one, and I'd argue the same is true of hard drives and optical drives, optical discs, sorry.
So at the end of all that, as I see it, optical media is the better choice. That said, there's not much in it between that and hard drives for local backup storage, because frankly, assuming you don't have an obsolescence problem, which I'm assuming we don't end up having, at some point in 15 or 20 years' time presumably both your hard drive and your optical discs will have lasted most of that distance, at which point you can simply go to the next big thing, the next optical format, the next whatever. By that point maybe everything will be SSDs and spinning platters will be gone. Can't say.
I was just thinking, it seems like we're actually in the middle of an obsolescence point right now with the transition to devices, particularly on iOS with its inability to get to the file system. If you've been using a Mac and you go iOS-only, there's a new set of problems. It's not seamless, and it's not quite clear to me how you would even make use of a lot of these systems. There's really no great way to do it without a PC.
That's true, and obviously this whole discussion is focused on the PC situation, with files and filing. On iOS, and to a lesser extent it's still true on Android, if you've got the Nexus 7 tablet... I suppose on some of the Android tablets you can have removable media, but the problem then is what would you do with the removable media? Say you take an SD card out of an Android tablet: it would then be, I guess, a flash drive sitting on the shelf for a backup.
It scares me a little bit, because not all services last forever, and when services die, not all of them die gracefully. People that rely on them to store all their personal information and don't do some kind of backup, whether it's local or not, just some kind of
backup, I mean failing to do some kind of backup, I think they're clinically insane. If you've got photos that you cannot get back, and you're not printing them, because people just don't print photos much anymore, they really don't, what are you going to do? Let's say Apple goes out of business tomorrow. Okay, this is a parallel dimension here, extremely unlikely. But let's say Apple screws up their web services, [unclear], where it's statistically more likely that they would have a catastrophic failure. And let's say not everyone gets screwed, let's say it's 1%, which is still a hefty number of people that would be shafted. All your Photo Stream is gone, and somehow that propagates out to all those people's devices and all the data is gone. That's a big "what if", but maybe it's not. Maybe you take a huge number of photos that are still important to you, they're backed up to Photo Stream, but they're no longer in your Camera Roll because for whatever reason you deleted them to make room for new photos.
Very, very likely. Not so much for people that are technical, the kind who listen to this show, but for people who have just got their first iPhone and are taking tons of photos and aren't really totally sure what's going on with Photo Stream. I'm not convinced I am. It's a big transition. In many ways the model of the devices is better and easier to use for people who aren't used to having to think about it. It wouldn't surprise me if someone has gone a year or two without backing up. I hear from my brother at the Apple Store all the time about people coming in and saying "where's my stuff?" because they didn't do any backups, and what they lost was deeply personal stuff. And it doesn't even require a big catastrophic failure like that. That's true. You just don't have iCloud backup turned on, or don't have Photo Stream turned on for whatever reason, and you drop your phone in a toilet and it's gone.
So I guess my issue here is with a subset of people. I wonder what the Venn diagram is of people that are creating lots of personal content but doing so solely on an iOS device, without any other companion Mac or PC in their lives. I do wonder what that Venn diagram would look like, because I suspect a lot of people in that scenario are not as prolific.
It's probably small, but it's changing. I think it's changing, and there is a generational aspect to it. [unclear] But you're right: as you start to produce more, you're going to run into the relatively low amount of storage space on iOS devices, which is the good side of it. You run into that limit and you'll be forced to start thinking about it.
I guess my issue is not necessarily that; it's the reliance on an external system, putting so much faith in Apple. Whether it's justified or not isn't the point. Whether they're reliable or not isn't the point. The point is that companies fail, things die, and they're beyond your control. However, if you had a Mac or PC locally, you back up your photos to a bunch of Blu-ray discs and keep them in a fireproof safe in the house, which is what I do by the way, and you update that say every 12 months. If there was a fire and I had to leave the house, I'd leave the fireproof safe, I'd grab my CD collection in one hand and rip the Time Machine out with the other, and that's it.
You'd probably grab the kids too.
Well, yeah, you know, priorities. Anyhow, the point is that it's one hand, two hands, done. It's no big deal; they're both very small and relatively light, though the Time Capsule is a bit heavy. So between the two I've got everything I want, everything I need. Anything that isn't burnt to a disc is on a Time Machine backup. Done. So that's my strategy. However, people, I
think, will need to consider their strategy. Maybe that's the final point to wrap up on: if you are just operating on an iOS device and you're not thinking about it, think about it. Because if you're not, and something does go wrong... and it's not a question of "if". I don't think you should treat life as an "if something happens"; you should be planning for when something goes wrong. Some people have what I call a charmed life, where not much seems to have gone wrong, but honestly that's the exception. What generally seems to happen is computers fail and people drop their phones in the toilet. I've seen that. Actually, I'm sorry, it wasn't a phone, it was a pager, and no, it wasn't mine. Anyhow, the point is it survived, but a phone wouldn't. My wife had orange juice spilt on her iPhone 3GS, which was a hand-me-down from me because I'd upgraded to the 4S at that point, and it was one of those things: it was unrecoverable. Fortunately she had good backups, because, well, she's married to me, and I'm into that sort of thing.
In any case, it's something to think about. I would not focus so much on the organising; I would focus more on the storing and on having some kind of plan. And if it's not local for you, that's fine. If you want to use CrashPlan or Backblaze and you've got a massive data pipe going into your house, then great, good for you: if you don't mind the ongoing costs, go for it. I'd recommend that, sure. But for a lot of other people that's not the answer, and if that's you, then strongly consider an external hard drive to back up to, or catalogue it all on optical discs like I've done, and keep them safe.
I think this came to the fore for a lot of people I know with the Mavericks update. I knew a few people who had really serious issues and lost most of their data. And I think, for me, a big part of the equation is, I think,
stopping the mindset of thinking about it as some sort of single catastrophic event happening, and more recognising that entropy is going to do its work and you're going to have little catastrophic events all the time. You want to not think about the thousand-year CD, right, and think instead about what's going to get me to my next migration. [unclear] There are reviews that you need to go through, maybe on a yearly basis: you do a certain round-up every year, a different round-up every month and a different round-up every week, and you think about this as a series of stages that your data goes through. Because I think the other side of this whole thing is that there is a search aspect, finding old stuff, whether it's a video or a document. [unclear] If we're generating just a huge amount of stuff all the time, and spending time and money and resources, in the form of extra drives, just keeping track of stuff we don't need or want, that's not great. [unclear] I like the idea.
I think about, you know, whether our system is so complicated, or so unwieldy, that we're just saving everything.
Yeah, that was a good point that I didn't really cover, and everyone will draw the line at a different place. But the truth is that when you look at storage these days, of the examples I was giving, the one that comes to mind the most is video. I'm amazed that you'll sit there and take 100 hours of video of a little baby rolling around and smiling and making gurgling noises, all the things a newborn does. But you think to yourself: okay, will I keep all 100 hours? Or am I going to snip out the bits? They've had the hiccups for the last five minutes; I think I've got the gist of how cute that is, so I won't keep it all, I'll keep the funniest 30 seconds to a minute or something and not the five minutes. So there's this whole idea of taking raw footage, editing the footage and getting a final product. You save the final product, not the raw footage, and the archive is cleaner. That's how I handle my videos, using the same technique you describe: I accumulate it for a year, I go through and do a cut for the whole year, I keep the good stuff, and I go "okay, I've got enough of this, I won't keep that", burn the final results and destroy the raw footage. That kind of concept, applied across all of your information, is a good way of thinking about what you should back up and whether you really need to keep it.
If you want to talk more about this, you can find John on Twitter at John Chidgey, and you should check out his site, techdistortion.com. If you'd like to send an email, you can send it to john at techdistortion.com. I'm Ben Alexander and you can reach me on Twitter at Fiat Lux. You can follow Pragmatic Show on Twitter to see show announcements and other related materials. And we wanted to take a moment to just talk about this for a second: we do not track
very much about how this show does. The one thing we really do pay attention to is the discussion and feedback; it's how we know we're doing a good job.
Absolutely. We've had some really good feedback and we'd always like some more. So with every episode, if you have just a spare moment, we would love any feedback you'd like to give us. If that includes a review or rating in iTunes, that's fine too, or just continuing the discussion of something we talked about on the show.
These are big topics. There are a lot of technical details and a lot of edge cases, and it's difficult for anyone to know the whole story. So it's good to know that people are paying attention, and it gives us a better idea of what we should be focusing on.
Absolutely. About the edge cases: it's not possible to cover everything on a lot of these topics in one show, so obviously if there are edge cases that you'd like us to explore more, we're more than happy to look into those.
Duration: 1 hour, 13 minutes and 38 seconds

Ben Alexander

Ben created and runs and Fiat Lux

John Chidgey

John is an Electrical, Instrumentation and Control Systems Engineer, software developer, podcaster, vocal actor and runs TechDistortion and the Engineered Network. John is a Chartered Professional Engineer in both Electrical Engineering and Information, Telecommunications and Electronics Engineering (ITEE) and a semi-regular conference speaker.

John has produced and appeared on many podcasts including Pragmatic and Causality and is available for hire for Vocal Acting or advertising. He has experience and interest in HMI Design, Alarm Management, Cyber-security and Root Cause Analysis.

You can find him on the Fediverse and on Twitter.