If you're scoping a data warehouse effort, spend the majority of your time understanding the source environment; that's where the uncertainty lies. Source systems built on older file technology, unsupported by modern ETL tools, and lacking both documentation and internal talent will be much more difficult to manage than their DBMS counterparts, which come with abundant, up-to-date documentation and plenty of available internal skills.
All is not lost, however, for those troublesome source systems. What is needed is a process of analysis that uncovers not only the fields they contain and how reliably those fields are populated, but also, just as importantly, the system's ability to deliver data: its activity levels and any concurrency issues that affect real-time extraction, or, if batch extraction is sought, its batch windows. Once all of this is known, the extract process can be designed.
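The field-population side of that analysis can be automated even for opaque sources, once you have a raw export in hand. Below is a minimal sketch in Python, assuming the drop arrives as delimited text; the field names and sample data are hypothetical, and a real profiler would also check data types, value ranges, and referential integrity.

```python
import csv
import io

def profile_fields(rows, fieldnames):
    """Count populated (non-blank) values per field to gauge how
    reliably each field is actually filled in the source.

    `rows` is an iterable of dicts (e.g. from csv.DictReader).
    Returns {field_name: (populated_count, total_record_count)}.
    """
    total = 0
    populated = {name: 0 for name in fieldnames}
    for row in rows:
        total += 1
        for name in fieldnames:
            # Treat missing keys and whitespace-only values as unpopulated.
            if (row.get(name) or "").strip():
                populated[name] += 1
    return {name: (populated[name], total) for name in fieldnames}

# Hypothetical sample drop: two records, one with no phone number.
sample = io.StringIO(
    "cust_id,name,phone\n"
    "1001,Acme Corp,555-0100\n"
    "1002,Globex,\n"
)
stats = profile_fields(csv.DictReader(sample), ["cust_id", "name", "phone"])
# stats["phone"] shows the field is populated in only 1 of 2 records
```

Running a profile like this over a full drop quickly flags fields that look usable on paper but are rarely populated in practice, which is exactly the kind of surprise you want to uncover before designing the transformation logic.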
Unfortunately, with older systems, often the best that can be achieved is a "drop" of the source data into a staging area built specifically for data warehouse purposes, from which the transformation part of ETL can begin.
Analyzing older source systems is not glamorous work. It often means learning technologies that have seen better days and systems that will probably be retired within a few years. However, it is a necessary part of data warehousing, and the area that causes the most trouble when not attended to properly.
For more information, check out searchCRM's Best Web Links on Data Warehousing/General Information.