Tuesday, November 1, 2016

A Data Management Platform for Supporting Multi-Stakeholder Partnerships in Citizen Science

A recent publication by the Government Accountability Office (GAO) (Open Innovation: Practices to Engage Citizens and Effectively Implement Federal Initiatives) sends an important message to Citizen Science (CS) organizations: public agencies will soon be knocking at your door, seeking to partner with your organization, motivated by the need to achieve greater impact by enlisting CS enthusiasm, knowledge and resources.

Such partnering opportunities face significant challenges, driven by differences in organizational cultures, both technical (e.g., reaching agreements on protocols) and administrative (e.g., project scheduling and reporting). Even with a common vision and an action plan in place, the partners are still left to worry about the tools that would make the partnership work. Since the most valuable outcome of the partnership is data (and the knowledge that comes with it), a system for managing data is key to success. To address this challenge, we at myObservatory offer a solution that combines the myObservatory Environmental Information Management System (EIMS) with an Enterprise Manager (EM), or in short myO/EM.

The vision that guided us at myObservatory (myObservatory.org) in developing myO/EM led to this outcome:  
(1) myO/EM allows the partners to pursue a well-defined set of common-core, coordinated activities and, at the same time, allows each of the partners to pursue its own agenda.
(2) myO/EM maintains the highest professional standards in all aspects of the data acquisition process, including field data acquisition, documentation and quality assurance, reporting, integrity of the database and compliance with data management requirements set by relevant public agencies.

myObservatory is documented extensively at myobservatory.org. So I will focus here more on the EM component. myO/EM is designed to face the challenges of a multi-stakeholder collaboration by supporting common-core activities while at the same time enabling each of the partners to pursue their own programs. Using the EM component, the group coordinator can manage projects and members; disseminate instructional material and guidelines; assimilate authorized data from the common core areas and analyze it using statistical and graphical tools; set standards for all common activities starting from field methods all the way to data labeling; create and share forms; monitor compliance of partners with set project guidelines; devise and implement quality assurance measures; generate reports; analyze trends and set alerts; and manage compliance with data management guidelines that could be mandated by funded projects.

myO/EM controls who can do what and when with any subset of the data, following user-provided guidelines. It queries questionable data entries. It maintains the integrity of the database (including chain of custody, keeping records of edits and reversing unwarranted changes). It also makes sure that data does not go out the door as people come and go, and it maintains professional backups.

Quality assurance (QA) is at the heart of any project, and particularly so when facing a broad diversity in user skills. There are several types of errors that we cover in myO/EM: grammatical errors, physical plausibility errors, and out-of-range errors. A grammatical error means, for example, writing 1.a3 instead of 1.03. QA means catching this error and alerting the parties entering the data and those managing it. An error in physical plausibility means entering a value outside the range of physically acceptable values for a certain parameter, for example, 17 for pH or a negative value for rainfall. Identifying out-of-range data means tracking trends in the data and detecting entries that depart from those trends; such entries could be erroneous, but they could also be real. When you have multiple organizations involved, a trickle of data entry errors could become an avalanche, so questionable data entries must be identified and controlled at the source. The myO/EM platform has built-in automated QA analytical tools that flag questionable data as soon as it is entered. This brings significant cost savings: you do not want an army of data checkers poring over data in the days before a major report is due, scrambling to come up with corrections.
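To make the three error types concrete, here is a minimal Python sketch of how such checks could be ordered. It is not the myO/EM implementation: the function name check_entry, the plausibility limits, and the "three standard deviations from past entries" rule are all assumptions made for illustration.

```python
from statistics import mean, stdev

# Hypothetical plausibility limits per parameter (illustrative values only).
PLAUSIBLE = {
    "pH": (0.0, 14.0),
    "rainfall_mm": (0.0, float("inf")),
}


def check_entry(parameter, raw, history, k=3.0):
    """Classify a raw text entry as 'grammatical', 'implausible',
    'out_of_range' or 'ok'. The checks run from cheapest to most contextual."""
    # 1. Grammatical check: the text must parse as a number ("1.a3" does not).
    try:
        value = float(raw)
    except ValueError:
        return "grammatical"

    # 2. Physical plausibility: the value must lie within fixed limits.
    low, high = PLAUSIBLE[parameter]
    if not (low <= value <= high):
        return "implausible"

    # 3. Out of range: the value is possible, but far from past entries.
    #    Assumed rule: more than k standard deviations from the historical mean.
    if len(history) >= 3:
        mu, sigma = mean(history), stdev(history)
        if sigma > 0 and abs(value - mu) > k * sigma:
            return "out_of_range"

    return "ok"
```

An entry flagged as out_of_range is not rejected in this sketch; it is only marked for review, since such a value could be real.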

myO/EM assists users with implementing new functionality that could be used by the entire community of partners. This could include, for example, selecting technological solutions (on both the software and hardware sides). Technologies, whether developed in-house or imported, could be managed through myO/EM. A technology platform that is commonly used opens the door for negotiating favorable arrangements with external suppliers, anywhere from purchasing sensors to developing solutions for managing legacy data.  

In addition to supporting the common-core activities, the myO/EM tools mentioned above are also available to support the partners’ independent activities residing outside of the common core. For example, there could be a set of forms that could be used by all partners in support of common-core activities, and there could be others that are owned and used by any subset of the partners, one or more, with or without sharing. The independent and common-core work areas are firewalled.
Let us not forget what motivates such partnerships: large-scale data acquisition, and this means mobility and diversity of data acquisition modes. myO/EM supports a variety of data entry modes: manual entry, sensors, mobile devices, uploads of data files (spreadsheets, pictures, documents) and direct lab feeds.

Finally, there is the issue of customization. No two projects are the same. There is always a need to customize the EIMS for the particular needs of a project. And that could easily be done with myO/EM.  

To summarize, myO/EM is a hybrid information management system that allows the partners to work together in the common core and separately and independently outside of the common core.
myO/EM is now available. Visit our site (myobservatory.org) or send me a note (yoram.rubin@webh2o.net). And, as a reminder, the single-organization version of myObservatory is available for free for any non-profit CS organization. You can sign up on our website.

Sunday, August 7, 2016

Challenges with Precision Agriculture: Finding the Balance Between Big Data and Local Conditions

Precision Agriculture (PA) is playing a major role in modernizing agriculture. PA usually means some sort of automated data collection (usually using sensors), often followed by an analysis (usually referred to as data mining) that relies on historical data from multiple sites/farms. This analysis serves two goals: it provides manufacturers with better insight into their products, and it is also translated into recommendations. There is increasing need for farms to be as productive as possible, while also minimizing and mitigating any environmental impact of operations. Data mining, however, usually implies averages and correlations. Can these serve the goals stated above? There is a long history to prove the value of large-scale data mining, but it comes with a challenge. Although intended to do so, PA does not meet all the data needs of individual farms. There is a gap between multi-site data mining, on one hand, and local (farm-scale) data needs, on the other. Farming is much more than large-scale averages and correlations. Experienced farmers often have a “feel” for their land, which is difficult to translate into hard data that’s specific to a local site. It is said that “all politics is local”, and so is farming. Thus, data must be acquired and recommendations must be made based on local, farm-scale data. Ideally, PA and local data can be coupled and complement each other in order to produce site-specific recommendations.

The global trends identified from data mining are important and helpful, but they are neither intended to address nor capable of addressing all the farm-scale challenges. This is a well-known challenge in all large-sample statistics (from studies of human populations to spatial and environmental data): large-scale trends do not provide local-scale answers. In medicine, for example, large-scale pharmaceutical trials are obviously very important when considering population health, but they won’t tell you about the response of a particular individual to the proposed treatment. As a patient, you obviously care about the local-scale response, not just population averages. Similarly, fish do not die of large-scale averages.

There is more to farming than a sensor in some soil. Farmers have their own ideas about what data to collect, where to collect it, and how to figure out which data is most useful. This is true in many branches of agriculture. We can find an example of this challenge in an article by Noel Magnin, an agriculture expert, who commented on a well-known behavior in vineyards (LinkedIn, February 9, 2016): “Quality in wine grapes is due to some water constraint during the last steps of fruit maturation associated among other things with secondary metabolites production. Grape quality leads to wine quality and the best vintages occurred when water constraint is present and obviously when the maturation stages just before fruit collection have seen little to no rain. However, there are within vineyards locations where fruit quality never reaches a level high enough to result in high quality wine”. So, one can look at average crop quality parameters, trying to provide some general fixes, but it is the problem locations, where the soil becomes saturated at critical times, that would make the difference between poor, medium, and high quality grapes. You need to know where the trouble spots are, and large-scale averages won’t tell you that.

Similarly, when managing cattle grazing operations, one can use some industry-based averages about grazing times and non-grazing intervals, but it would be more beneficial to modify these averages to reflect local conditions. Data mining provides good prior knowledge, but that prior knowledge must be updated by local conditions. This means reconciling prior knowledge (the averages) with site-specific evidence. To do that, farmers need to explore the history of their farming operations in pictures, notes, sensor data, lab data and more, all dated and geo-referenced - and they should be able to explore that data with ease. This is, of course, not a new idea. What’s missing, however, are the tools that will allow farmers to do that.
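One common way to formalize "prior knowledge updated by local conditions" is a Bayesian update. The Python sketch below uses the textbook normal-normal model as an assumption; it is not a method described in this post, and the function name, the recovery-interval numbers and the variances are all hypothetical.

```python
def update_with_local_data(prior_mean, prior_var, observations, obs_var):
    """Normal-normal Bayesian update.

    prior_mean, prior_var: the industry-wide average and how uncertain it is
    observations:          measurements taken on this particular farm
    obs_var:               assumed variance of a single local measurement
    Returns (posterior_mean, posterior_var).
    """
    n = len(observations)
    if n == 0:
        # No local evidence yet: fall back on the prior.
        return prior_mean, prior_var

    local_mean = sum(observations) / n
    prior_precision = 1.0 / prior_var
    data_precision = n / obs_var

    posterior_var = 1.0 / (prior_precision + data_precision)
    posterior_mean = posterior_var * (
        prior_precision * prior_mean + data_precision * local_mean
    )
    return posterior_mean, posterior_var


# Hypothetical numbers: an industry average of 30 days for pasture recovery,
# and three local observations of 40, 44 and 42 days.
mean_days, var_days = update_with_local_data(30.0, 25.0, [40.0, 44.0, 42.0], 9.0)
```

With these made-up inputs the updated estimate lands between the industry average and the local mean, and much closer to the local mean, because three local measurements carry more weight than a loosely held prior.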
Let’s take a look at a few examples.

1. Data at your fingertips

What we have here is an example of how data could be organized in myObservatory, a web-based information management and analysis system. The blue areas are hand-drawn shapes representing parcels/blocks/paddocks of particular interest. Each of these shapes acts as a container for all related data. This could include pictures, reports, notes, sensor data, lab reports, etc. The system provides seamless connectivity between desktop and smartphones. So, for example, pictures can be taken using smartphones, and these pictures are automatically uploaded onto the myObservatory platform and automatically linked to the coordinates where they were taken. Data could be imported from external data providers for added insight, or data can be fed in by third parties (e.g. from partnered labs offering soil analysis services). Once data has been loaded or collected, it may be analyzed with statistical analysis and charting tools or geospatial analysis tools, or shared with selected stakeholders.

All data are access-controlled. The project or site admin can assign users an appropriate level of access, anywhere from public view to adding data only, all the way to read/write privileges for any or all data and adding new users.
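The range of access levels described above can be sketched as an ordered set of levels, each action requiring a minimum level. This is only an illustration: the names Access, REQUIRED and is_allowed are hypothetical, and treating the levels as strictly ordered is an assumption, not a description of how myObservatory stores permissions.

```python
from enum import IntEnum


class Access(IntEnum):
    """Hypothetical access levels, ordered from least to most privileged."""
    PUBLIC_VIEW = 1
    ADD_ONLY = 2
    READ_WRITE = 3
    ADMIN = 4  # read/write plus the right to add new users


# Minimum level assumed to be required for each action.
REQUIRED = {
    "view": Access.PUBLIC_VIEW,
    "add": Access.ADD_ONLY,
    "edit": Access.READ_WRITE,
    "add_user": Access.ADMIN,
}


def is_allowed(level, action):
    """Return True if a user holding `level` may perform `action`."""
    return level >= REQUIRED[action]
```

A field volunteer with Access.ADD_ONLY could then submit new observations but not edit existing records, while the site admin could do both and also add users.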


2. Timeline

This example (courtesy Peter Traverse, Innisfree Farms, and myObservatory) shows time-lapse photos of one of the grazing areas on the farm. Want to know how long your herd grazed here? And how long it took for recovery to occur? Here it is, ready and available at your fingertips. Want to add notes? Want to link these images with lab data or with data from your groundwater wells or rainfall data? You can do it with a click of a button.

3. Story Viewer

The third example (courtesy Peter Traverse, Innisfree Farms, and myObservatory) tracks the movements of the herd. Each of the pictures was taken using a smartphone, which automatically geo-tagged and dated them. Once within transmission range, the picture is uploaded and stored in the farm’s database, ready and available for analysis. Notes can be added and auxiliary files may be attached.  You do not need a full-time photographer to take these pictures. You can have all your staff taking pictures with their smartphones, and then all these pictures would be automatically assembled and organized by myObservatory. All these pictures could be easily accessible and searchable by date and location.

In conclusion, farming is local, and farmers need a platform that will allow them to collect and explore their data with ease. With Story Viewer, Timeline, and with easy access to data, you are ready to explore your data and make the decisions most suitable for your farm. For more information, visit our website at my-observatory.com.

Tuesday, June 28, 2016

Environmental Sustainability is a Multi-Stakeholder Effort Driven by Common Goals, by Information, and by Analysis. How Do We Make It All Work Together?

There is a science that defines sustainability. At times there may also be a legal framework to define the effort. To make sustainability happen, both the scientific foundation and the legal framework must be accompanied by a community effort. Community can be defined in many different ways, but what all definitions have in common is the need to secure diverse modes of participation, to provide a flexible organizational structure, and to provide a collaborative data environment that brings all elements together. Without a community behind it, any goal is difficult to achieve, despite legal or scientific support. In this short article I will summarize some of our experiences gained from developing a collaborative data environment called myObservatory (or in short, mO; see myobservatory.org), and from dialogues with our partners and users.
Let’s start with community. Communities could vary in membership. A community may include a few cattle farmers in Argentina, or it could scale up all the way to national organizations of citizen scientists, like the ASPEA organization in Portugal (as discussed here), which monitors the health of a national river network spanning tens of thousands of kilometers with the help of thousands of volunteers. A community can even be a global organization that supports sustainable agriculture, like Savory Global. Communities both large and small need advanced collaboration tools. The level of sophistication should not depend on the scale of the organization. This is what guided our thinking in designing our collaborative data environment.
How do we make communities, small or large, work together? There may be different factors to consider. Obviously, a compelling vision is needed in order to attract participation and to maintain enthusiasm. To translate this into action, a collaborative data environment is needed, one that would allow the community to translate motivation into tangible products. To do so, (1) it must provide the organization with flexibility in defining and accommodating user roles; (2) it must accommodate multiple and diverse modes of data entry; (3) it should be able to maintain the credibility and integrity of the data collected; and (4) it should allow users to generate meaningful and exciting content. It must also be easy and pleasant to use, otherwise the tools themselves serve as a barrier to achieving the goal. Let’s take a look at some of these elements.

User Roles. A collaborative data environment needs to accommodate a wide range of roles, such as administrators, technicians, consultants, citizen scientists, analysts, observers, and possibly others. A user role is not just a title. A user is defined by data access privileges, and by the options to create content and provide guidelines.   
Occasional users, outside of the core group of committed users, should also be accommodated. We would like to have a core group of professionals and volunteers, providing support on a regular basis. But we should also appreciate the occasional visitors who may want to inspect our work or make occasional contributions. This is especially true when reports on hazards or special events are welcome and encouraged. People with smartphones could provide a huge source of actionable information, especially when immediate action is warranted. Imagine a user taking a picture of some environmental hazard using a smartphone; the picture, geotagged and dated, is immediately transmitted to become a part of the database and a GIS display, and triggers some sort of response. This accessibility empowers communities to take charge of their environment, and could keep the entire group fired up and motivated.
Collaborative efforts in support of environmental sustainability operate in a multi-stakeholder environment. This requires careful planning. I mentioned already the strict user access controls intended, among other things, to protect privacy of data. Operation of the data environment by a private entity that is not subject to the Freedom of Information Act (FOIA) could be very important, as many stakeholders may be willing to share data only under limiting conditions that cannot be met by government agencies. Such an arrangement keeps the fear of bad publicity from deterring remediation efforts.
We also realized that not all of those who wish to use the data are equally motivated to participate. What we found, however, is that in some cases users could be enticed to participate and become full-fledged contributors by being able to benefit from the collaborative data environment in ways that may be only indirectly related to sustainability. Take, for example, our experience in a project focused on sustainable management of a groundwater basin in California. Here the list of stakeholders included state and local water agencies, as well as private well owners and volunteers. Private well owners proved to be a challenge, as they are not required, by California law, to provide data on their wells. However, we managed to make some progress here by providing benefits that are directly related to groundwater, such as updates and alerts related to regional and local trends in groundwater levels. Additionally, we provided access to agriculture-related functions that are indirectly related to groundwater (e.g., degree-days needed for pesticide management, tools for analysis of pumping tests). For this, we implemented a wide range of analytical and scientific tools, all mounted on a GIS platform. For example, consider this image, which provides a snapshot of groundwater levels and flow directions at the Sonoma Creek groundwater aquifer in California. This map is generated in real-time from data provided by all the stakeholders. The green dots represent state-owned groundwater monitoring wells. The blue lines represent groundwater levels and the red arrows mark flow directions. Looking at consecutive snapshots like this, one could draw conclusions about trends. Well owners are particularly worried about the water level falling below critical elevations required by their pumps. Addressing this and similar concerns could be very useful in attracting participation.

Multiple Data Entry Modes. Flexibility in defining metrics for sustainability requires having multiple modes of data entry. This could include manual data entry into specially designed forms, file uploads and sensor feeds. All data should be geo-tagged and dated. Editors should be allowed to fill in the blanks for any data that is not geo-tagged.
Smartphones are particularly useful for connecting with a large number of users in real-time. In mO, we view the smartphone as a vital component, and to accommodate it, we created seamless connectivity between smartphones and desktops. Our smartphone technology allows quick and seamless assimilation of data. Data transmitted via smartphones includes pictures, notes, and filled-in forms, and it becomes actionable as soon as it is transmitted and displayed, which could take a fraction of a second. For example, in our Natuf Project in the Middle East, users record information on new environmental hazards by filling in specially-designated forms and taking notes and pictures. As soon as this information is received, it is used to update a hazard map in real-time. This map is then processed together with a vulnerability map using a built-in algorithm, producing an updated risk map. This process is demonstrated in the figure below. The vulnerability map represents local conditions (soil, vegetation, water resources, land use, depth to groundwater, and others). The map at the center represents recorded hazards. The multicolored icons mark where risks were reported. These icons are clickable, to reveal all the relevant information. Each icon contains a description of the pollution hazard and perhaps even a picture, which can be viewed on top of the risk map to understand what is causing risk in an area.

The left and center figures above represent intermediate output maps, generated by interpolation from point sources of hazard and risk data. These two maps are then processed to produce the risk map, shown on the right. The risk map updates daily. It is a vital tool for maintaining the sustainability of the underlying aquifer. This entire process is executed using scientific modules embedded in mO. The complete user-smartphone-desktop process is shown and discussed here.
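The built-in algorithm that combines the two maps is not described here. As a purely illustrative stand-in, the Python sketch below assumes both maps are grids of scores between 0 and 1 and takes risk to be their cell-by-cell product, a common convention in risk mapping. The function name risk_map and the grid values are hypothetical.

```python
def risk_map(hazard, vulnerability):
    """Combine two equally sized grids (lists of rows) into a risk grid.

    Assumption for this sketch: risk = hazard * vulnerability in each cell,
    with both inputs scaled to the range 0..1.
    """
    if len(hazard) != len(vulnerability):
        raise ValueError("grids must have the same number of rows")

    risk = []
    for h_row, v_row in zip(hazard, vulnerability):
        if len(h_row) != len(v_row):
            raise ValueError("grids must have the same number of columns")
        risk.append([h * v for h, v in zip(h_row, v_row)])
    return risk


# Hypothetical 2x2 grids: a strong hazard only matters where the
# underlying conditions are vulnerable, and vice versa.
hazard = [[0.0, 0.5], [1.0, 0.2]]
vulnerability = [[0.9, 0.4], [0.5, 0.0]]
risk = risk_map(hazard, vulnerability)
```

Under this assumption a cell with a recorded hazard but zero vulnerability contributes no risk, which is why the two intermediate maps have to be read together.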

Maintaining Data Integrity. I am not talking here about data quality, which is a separate issue that can be addressed in different ways, e.g., by training and verification. What integrity means is adhering to well-known practices of data custody. That requires maintaining strict user access controls and a chain of custody. Site administrators should be given the necessary tools to maintain the integrity of the data, and should be required to follow strict protocols governing the data and what can be done with it.

Content and Analytics: Making Sense of Data. This is the most rewarding component of the myObservatory collaborative data platform. Ideally one should just be able to analyze the data as it flows in, identify trends, generate or receive alerts, communicate with stakeholders, compare results across regions, identify data needs, and manage volunteers, all by clicking on an icon (preferably on a map). To make this happen, we implemented analytical tools that are used universally. We also realize that different organizations have different needs, and so we have the option to implement project-specific or even proprietary tools.
Summary. What I presented here is a short summary of an environmental information management system, tailored for the needs of sustainability. It was developed by a group of engineers, programmers and sustainability experts. It evolved over many years of experience and user interaction. Give it a try and let me know what you think so that we can improve our technology. Get in touch with myObservatory if you need any assistance getting started, or want to discuss using it as a potential solution to a problem. Or, just post a comment or a question here!

Tuesday, May 31, 2016

myObservatory goes to school

Last week I went and met with eight classes of eighth graders at the Sandwich STEM Academy to introduce them to myObservatory.  Later this week and again next week I will be accompanying them on field trips to collect data at two different beaches.  Our in-class lesson focused on how to orient yourself with a map and how to interpret photo data.

Students were really interested in the map activities.  I had them try to locate a landmark on a zoomed-in map of our region.  I was amazed that more students could not identify their school or the nearby waterway of Scortons Creek.  I look forward to using this experience to design a map orientation activity for future students.

Another myObservatory tool that we used is the newly added "Story Viewer" feature.  The story viewer allows you to see a photo on one side and the map location of that photo on the other side.  It was especially fun to talk with students about tide using these images.  In the example below notice that the map view shows the marsh at low tide.  Students observed the empty drainage ditches and sand bars in the creek.  And the photo on the left shows the location at high tide where ditches are barely visible.  What a fun way to connect ideas!

The eighth graders were all signed up for a myObservatory account.  When we are in the field they will be able to use their own login account to record data from the trip.  I am looking forward to spending a few data days at the beach!

Friday, May 6, 2016

David Vs. Goliath

In the world of GIS-based data management systems myObservatory is the David. There is a Goliath out there. And they are very good at what they do. 
The other day I was strolling among the booths in a major international conference. I saw Goliath’s booth. There were about 5 sales people there. They were very nice and informative and helpful. But it made me think: who is paying for all this? Obviously, it is the client who pays for this. When you buy Goliath’s product, you pay for a lot of very expensive overhead. We do not have a fancy headquarters. We do not have a network of offices all over the world. We rarely participate in conferences. 

Goliath is extraordinarily powerful, with a lot of tools in its arsenal. It struck me that many of the tools will only be used by the giants: major corporations, government agencies, etc. We have many of the tools that Goliath has, but not all. The question is, when you purchase from Goliath, do you need to pay for the tools the giants use? For example, when is the last time you used a geostatistical package? In fact, why should you pay for it when there are many geostatistics packages out there that are free (e.g., R has a wonderful geostatistics package)? Do you need to pay for the development of expensive, basin-scale hydrological response models that, in all likelihood, you will never use? (And if you ever do need one, will you not turn to consultants who are already paying for all this or have in-house, proprietary tools?) As a hydrologist, I can assure you, there is no standard, plug-and-play hydrological model. There is an appropriate time to turn to Goliath, but for a large number of casual GIS users and professionals who work in a specific subject area, Goliath is overkill.
So, a few questions to you, our reader:

If you are a professional (teacher, environmentalist, architect, engineer, park ranger, or citizen scientist), and not part of a mega-organization: do you need to pay for Goliath? Desktop users have a variety of free or low-cost tools to choose from for desktop-based data creation, and myObservatory could serve you well for your data sharing, management, and web access needs. myObservatory Mobile could serve you well for in-field data collection.

If you are a traveler or an explorer, wishing to document your travel with a nice, very well organized set of GPS-tagged and dated pictures with notes, do you need to pay for Goliath? We could do that for you, for the monthly price of a coffee and a doughnut.

If you wish to visualize your data, see a timeline, or generate charts of your data, do you need Goliath? This is easy to do on myObservatory. 

If you are a farmer, wanting to document your activities, to archive pictures of your fields so that you can remind yourself how your field looked before and after and over as many years as you have data, do you need to pay for sales offices all over the world? 

Want to have seamless mobile-web connectivity? We have it. Why pay for sending sales people to conferences?

Need to share data with all your stakeholders with differentiated access privileges? We can do it for you. No need to pay the salary of a CEO and c-suite officers, and of dozens of thought leaders. 
My suggestion: myObservatory is tailored for your world. Pay for the tools you need, at the price you can afford. Really. Check it out here: http://www.my-observatory.com/single-project-plans
Write to me at yoram.rubin@webh2o.net, and I will convince you we can do all that... and more.
And if you managed to read all the way to here: you deserve a shot at our free myObservatory Mobile. Download it and have fun.

Tuesday, May 3, 2016

Sensor Makers at the Faire

This Saturday I accompanied two students, Cam and Michael, from Sandwich STEM Academy to the Cape Cod Mini Maker Faire at Mashpee High School.  It was my first time at the event, now in its third year.  I was thoroughly impressed by the number of Makers present and the breadth of their makings.  It was great to see our students represent STEM Academy and myObservatory with the Arduino-based weather sensors that they are building for the Scortons Creek Project.

myObservatory purchased the parts for students to build their own sensors.  We are using the experience of our first students to create a teaching and learning module on sensors and data collection.  I visited STEM Academy four times over the last two months to work with students during engineering class.  The two students chosen for this special assignment, Cam and Michael, have thrown themselves into the project whole-heartedly.  They work hard during our visits and are interested in how they can help teach other students how to solder and build sensors.  They both spent their recent April school vacation soldering and coding to prepare for this Faire.

Attending Saturday's event was a milestone for the team.  The boys brought two built sensors with some minor sketch issues to the Maker Faire.  They fixed the original prototype and got it working while they were there.  They shared their experience with visitors stopping by their table.  They even got to look at code with a fellow Arduino fan.  More than anything it was nice to see them blend with all of the other student and adult makers present at the event.

Over the next few weeks the sensors will be installed out at Scortons Creek where they will start collecting weather data.  This data will be uploaded into myObservatory.  It will become a part of the shellfish feasibility study that is being done in collaboration with the Sandwich Department of Natural Resources and the STEM Academy.  I look forward to sharing myObservatory with the entire 8th grade later in May.  Hopefully by then we will have some data collected by sensors built by the students' classmates.

It is great to watch the process of Michael and Cam's build.  Listening to them work together and watching their constructive learning experience is a teacher's dream!  I feel lucky to have the opportunity to work with such talented and motivated students.  I have learned so much from watching them...I almost feel like I could build a sensor myself (*almost*). This is community-based, project-based, inquiry-based learning at its best!

Monday, April 25, 2016

Observations of myself: Writer's Block

I have been trying and meaning to blog for the last 20 days...WOW!  That really got away from me!  For someone with so many thoughts, who regularly manages to post silly photos of her kids and dogs on Facebook, it is amazing to experience writer's block.  And with each passing day I get farther and farther away from the ideas I meant to blog about.  Today is the day to end all that and get back to business.

It's not that I haven't made any observations.  I have.  I have been feeling the warmth of spring sunshine on my skin and smelling the first scents of flowering trees and hearing the peals of laughter from two wild boys unrestricted by winter layers.  I have been watching the slow march of buds that signals the end of a long and dark winter and the beginning of another summer.

It's not that I haven't used myObservatory.  I have.  I collected some great new photo vantage points of Scortons Creek from the middle of the marsh.  I worked with students to build Arduino-based sensors to install out at Scortons Creek.  I contacted several interesting people with interesting projects from all over the country that I met in Nashville.  And I continue to collect my own personal data of my yard and my adventures with the aforementioned wild boys.

So what's my deal?!  

I have been making excuses for 20 days.  I am too busy with other work.  I want to spend more time with my kids.  I hurt my back.  My husband had surgery.  I need to clean my house.

But the observations of myself that seem most true are just this: It's spring!  It's getting too nice to be stuck inside writing on my computer.  I needed a break to read a book for fun (yup, that's right I did!) and lay out sunning myself reptilian-style.  I have been grinding away the bad weather and cold by working myself harder than ever; I am tired of working.  I am not motivated; I am too distracted by the lengthening days and warmer nights.  I feel change in the air and could not reconcile what this change means in my own life.

But now I have come clean.  I have admitted to myself that my observations of me are just as important as my observations of the world around me.  I'm glad this is over with and I can start writing again!  I wanted to feel like coming back to writing was natural and not forced.  I needed to give myself a break and let my brain come back around.  I did all that.  

Here I am, BEWARE of random data stories and exciting new progress for myObservatory!