function [cp_occurrence_confidence,change_point,temporal_confidence_window]=change_point_analysis_formal_method(input_data,num_bootstrap_runs,num_permutations,cp_ID_method,bootstrap_style) %used to a) test the hypothesis that there is a real change point in the data, and get confidence intervals for the occurrence, b) determine when the cp occured, and c) get temporal confidence windows %start with permutation of the measured data ('shuffling'), find the cp of the shuffled data then the cp of each bootstrap run on that shuffled data. record these cp values for each shuffle. %find the cp for the unshuffled data and the cp for each bootstap run of the unshuffled data, record these cp values. %see, in percentile terms, how the variance of the cp values of the bootstrapped, unshuffled data compares to the variance of the cp values of the bootstrapped, shuffled data -- this is the confidence that there was really a change %if a cp really occurred, when it occurred and the temporal confidence window are given by the cp measurement on the unshuffled data and the range of the cp measurements for the bootstraps of the unshuffled data, respectively %implementations: %actual identification of the change point can be done by minimized SSE or by a method that uses the cumulative sum of deviations from the mean to see when the walk furthest below the mean occurred %bootstrap can be done the basic way with resampling, by using residuals with resampling, or by 'balanced resampling' (davison and hinckley) %checked aug 7/2007, rechecked sep 28 with addition of +1 in last (noncomment) line %called by analyze_across_sd_stdp_expts %calls shuffle_randomization, find_change_point_using_sum_of_squares,find_change_point_using_trend_method, basic_bootstrap_dwn,residual_bootstrap_dwn,balanced_bootstrap_dwn %tic len=length(input_data); range_of_changepoints=zeros(num_bootstrap_runs,num_permutations+1); for permute=1:num_permutations+1 %the last run is not shuffled and the measured change point and temporal confidence interval for this data are returned as the true values if permute<=num_permutations datavec=shuffle_randomization(input_data); else datavec=input_data; end if cp_ID_method==1 [change_point,smallest_sse]=find_change_point_using_sum_of_squares(datavec);%smallest_sse is not used here elseif cp_ID_method==2 [change_point,change_magnitude]=find_change_point_using_trend_method(datavec);%changemagnitude is not used end if bootstrap_style==1 temporal_change_points=basic_bootstrap_dwn(datavec,num_bootstrap_runs,len,cp_ID_method); elseif bootstrap_style==2 temporal_change_points=residual_bootstrap_dwn(datavec,num_bootstrap_runs,len,cp_ID_method,change_point); elseif bootstrap_style==3 temporal_change_points=balanced_bootstrap_dwn(datavec,num_bootstrap_runs,len,cp_ID_method); end range_of_changepoints(:,permute)=temporal_change_points'; end rocp=std(range_of_changepoints); rocpsort=sort(rocp(1:num_permutations)); cp_occurrence_confidence=100*(1-length(find(rocp(end)>rocpsort))/num_permutations); temporal_change_points=sort(temporal_change_points); %using the last temporal_change_points, the ones found by bootstrap of the *unshuffled* data cutoff=ceil(.025*num_bootstrap_runs); %choose num_bootstrap_runs s.t. no rounding is needed here temporal_confidence_window=[temporal_change_points(cutoff+1) temporal_change_points(end-cutoff)]; %the plus one would seem necessary to ignore that 5% of values that are most unlikely %toc %------------this was work towards a way to test the null hypothesis of no change point using likelihood ratio test % ordinal_vecs_for_h0=sort(ceil(rand(num_bootstrap_runs,len)*len)')'; %takes 'sampling with replacement' to mean any value from the input data has equal chance of being called any # of times % h0_SSEs=zeros(1,num_bootstrap_runs); % for bootstraps_for_h0_likelihood_ratio_test=1:num_bootstrap_runs % h0_pseudodata=datavec(1,ordinal_vecs_for_h0(bootstraps_for_h0_likelihood_ratio_test,:)); % h0_SSEs(bootstraps_for_h0_likelihood_ratio_test)=sum((h0_pseudodata-mean(h0_pseudodata)).^2); % end % h0_SSEs=sort(h0_SSEs); % %figure,plot(h0_SSEs) % % ordinal_vecs_for_h1=sort(ceil(rand(num_bootstrap_runs,len)*len)')'; %takes 'sampling with replacement' to mean any value from the input data has equal chance of being called any # of times % h1_SSEs=zeros(1,num_bootstrap_runs); % for bootstraps_for_h1_likelihood_ratio_test=1:num_bootstrap_runs % if cp_ID_method==1 % [cp,smallest_sse_h1]=find_change_point_using_sum_of_squares(datavec(1,ordinal_vecs_for_h1(bootstraps_for_h1_likelihood_ratio_test,:))); %change point is not used here % elseif cp_ID_method==2 % [cp,changemagnitude]=find_change_point_using_trend_method(datavec(1,ordinal_vecs_for_h1(bootstraps_for_h1_likelihood_ratio_test,:))); % smallest_sse_h1=sum((datavec(1:cp)-mean(datavec(1:cp))).^2)+sum((datavec((cp+1):end)-mean(datavec((cp+1):end))).^2); % end % h1_SSEs(bootstraps_for_h1_likelihood_ratio_test)=smallest_sse_h1; % end % h1_SSEs=sort(h1_SSEs); % %figure,plot(h1_SSEs) % % % %LIKELIHOOD OF ONE MEAN V LIKELIHOOD OF TWO MEANS % % likelihood of one mean=L(theta_onemean|observed SSEs from the bootstrap pool with one mean) % % likelihood of two means=L(theta_twomeans... % %chgpoint_existence_confirmation=42